Abstract

We  present  Tor,  a  circuit-based  low-latency  anonymous  com¬ 
munication  service.  This  second-generation  Onion  Routing 
system  addresses  limitations  in  the  original  design  by  adding 
perfect  forward  secrecy,  congestion  control,  directory  servers, 
integrity  checking,  configurable  exit  policies,  and  a  practi¬ 
cal  design  for  location-hidden  services  via  rendezvous  points. 
Tor  works  on  the  real-world  Internet,  requires  no  special  priv¬ 
ileges  or  kernel  modifications,  requires  little  synchronization 
or  coordination  between  nodes,  and  provides  a  reasonable 
tradeoff  between  anonymity,  usability,  and  efficiency.  We 
briefly  describe  our  experiences  with  an  international  network 
of  more  than  30  nodes.  We  close  with  a  list  of  open  problems 
in  anonymous  communication. 

1  Overview 

Onion  Routing  is  a  distributed  overlay  network  designed  to 
anonymize  TCP-based  applications  like  web  browsing,  se¬ 
cure  shell,  and  instant  messaging.  Clients  choose  a  path 
through the network and build a circuit, in which each node
(or “onion router” or “OR”) in the path knows its predecessor
and successor, but no other nodes in the circuit. Traffic flows
down the circuit in fixed-size cells, which are unwrapped by a
symmetric  key  at  each  node  (like  the  layers  of  an  onion)  and 
relayed  downstream.  The  Onion  Routing  project  published 
several  design  and  analysis  papers  [27,  41,  48,  49].  While  a 
wide  area  Onion  Routing  network  was  deployed  briefly,  the 
only  long-running  public  implementation  was  a  fragile  proof- 
of-concept  that  ran  on  a  single  machine.  Even  this  simple 
deployment  processed  connections  from  over  sixty  thousand 
distinct  IP  addresses  from  all  over  the  world  at  a  rate  of  about 
fifty  thousand  per  day.  But  many  critical  design  and  deploy¬ 
ment  issues  were  never  resolved,  and  the  design  has  not  been 
updated  in  years.  Here  we  describe  Tor,  a  protocol  for  asyn¬ 
chronous,  loosely  federated  onion  routers  that  provides  the 
following  improvements  over  the  old  Onion  Routing  design: 

Perfect  forward  secrecy:  In  the  original  Onion  Routing 
design, a single hostile node could record traffic and later
compromise successive nodes in the circuit and force them
to decrypt it. Rather than using a single multiply encrypted
data structure (an onion) to lay each circuit, Tor now uses an
incremental  or  telescoping  path-building  design,  where  the 
initiator  negotiates  session  keys  with  each  successive  hop  in 
the  circuit.  Once  these  keys  are  deleted,  subsequently  com¬ 
promised  nodes  cannot  decrypt  old  traffic.  As  a  side  benefit, 
onion  replay  detection  is  no  longer  necessary,  and  the  process 
of  building  circuits  is  more  reliable,  since  the  initiator  knows 
when  a  hop  fails  and  can  then  try  extending  to  a  new  node. 

Separation  of  “protocol  cleaning”  from  anonymity: 
Onion  Routing  originally  required  a  separate  “application 
proxy”  for  each  supported  application  protocol — most  of 
which  were  never  written,  so  many  applications  were  never 
supported.  Tor  uses  the  standard  and  near-ubiquitous 
SOCKS  [32]  proxy  interface,  allowing  us  to  support  most 
TCP-based  programs  without  modification.  Tor  now  relies  on 
the  filtering  features  of  privacy-enhancing  application-level 
proxies  such  as  Privoxy  [39],  without  trying  to  duplicate  those 
features  itself. 

No  mixing,  padding,  or  traffic  shaping  (yet):  Onion 
Routing  originally  called  for  batching  and  reordering  cells 
as  they  arrived,  assumed  padding  between  ORs,  and  in  later 
designs  added  padding  between  onion  proxies  (users)  and 
ORs [27, 41]. Tradeoffs between padding protection and
cost  were  discussed,  and  traffic  shaping  algorithms  were 
theorized  [49]  to  provide  good  security  without  expensive 
padding,  but  no  concrete  padding  scheme  was  suggested.  Re¬ 
cent  research  [1]  and  deployment  experience  [4]  suggest  that 
this  level  of  resource  use  is  not  practical  or  economical;  and 
even full link padding is still vulnerable [33]. Thus, until we
have  a  proven  and  convenient  design  for  traffic  shaping  or 
low-latency  mixing  that  improves  anonymity  against  a  realis¬ 
tic  adversary,  we  leave  these  strategies  out. 

Many  TCP  streams  can  share  one  circuit:  Onion  Rout¬ 
ing  originally  built  a  separate  circuit  for  each  application- 
level  request,  but  this  required  multiple  public  key  operations 
for  every  request,  and  also  presented  a  threat  to  anonymity 
from building so many circuits; see Section 9. Tor multiplexes
multiple TCP streams along each circuit to improve
efficiency  and  anonymity. 

Leaky-pipe  circuit  topology:  Through  in-band  signaling 
within the circuit, Tor initiators can direct traffic to nodes
partway  down  the  circuit.  This  novel  approach  allows  traf¬ 
fic  to  exit  the  circuit  from  the  middle — possibly  frustrating 
traffic  shape  and  volume  attacks  based  on  observing  the  end 
of  the  circuit.  (It  also  allows  for  long-range  padding  if  future 
research  shows  this  to  be  worthwhile.) 

Congestion  control:  Earlier  anonymity  designs  do  not  ad¬ 
dress  traffic  bottlenecks.  Unfortunately,  typical  approaches  to 
load  balancing  and  flow  control  in  overlay  networks  involve 
inter-node  control  communication  and  global  views  of  traffic. 
Tor’s  decentralized  congestion  control  uses  end-to-end  acks 
to  maintain  anonymity  while  allowing  nodes  at  the  edges  of 
the  network  to  detect  congestion  or  flooding  and  send  less 
data  until  the  congestion  subsides. 

Directory  servers:  The  earlier  Onion  Routing  design 
planned  to  flood  state  information  through  the  network — an 
approach  that  can  be  unreliable  and  complex.  Tor  takes  a 
simplified  view  toward  distributing  this  information.  Cer¬ 
tain  more  trusted  nodes  act  as  directory  servers:  they  provide 
signed  directories  describing  known  routers  and  their  current 
state.  Users  periodically  download  them  via  HTTP. 

Variable  exit  policies:  Tor  provides  a  consistent  mecha¬ 
nism  for  each  node  to  advertise  a  policy  describing  the  hosts 
and  ports  to  which  it  will  connect.  These  exit  policies  are  crit¬ 
ical  in  a  volunteer-based  distributed  infrastructure,  because 
each  operator  is  comfortable  with  allowing  different  types  of 
traffic  to  exit  from  his  node. 
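
As a rough illustration of how an advertised exit policy might be evaluated, the sketch below matches a destination against an ordered rule list. The rule syntax (`accept 0.0.0.0/0:80-443`), the first-match-wins ordering, and the default-reject fallback are all assumptions made for this example; the text above does not specify a policy format.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    accept: bool
    network: ipaddress.IPv4Network
    ports: range


def parse_rule(text):
    """Parse a hypothetical rule such as 'reject 0.0.0.0/0:25'."""
    action, target = text.split()
    if action not in ("accept", "reject"):
        raise ValueError(f"unknown action: {action}")
    net, _, portspec = target.rpartition(":")
    low, _, high = portspec.partition("-")
    high = high or low  # a single port is a one-element range
    return Rule(action == "accept", ipaddress.ip_network(net),
                range(int(low), int(high) + 1))


def allows(policy, host, port):
    """Return the verdict of the first matching rule (assumed semantics)."""
    addr = ipaddress.ip_address(host)
    for rule in policy:
        if addr in rule.network and port in rule.ports:
            return rule.accept
    return False  # assumed default: reject anything no rule mentions


# An operator who is happy to relay web traffic but not outgoing mail.
policy = [parse_rule("reject 0.0.0.0/0:25"),
          parse_rule("accept 0.0.0.0/0:80-443")]
```

With this policy, `allows(policy, "192.0.2.7", 80)` is true, while port 25 and any unlisted port are refused.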

End-to-end  integrity  checking:  The  original  Onion  Rout¬ 
ing  design  did  no  integrity  checking  on  data.  Any  node  on  the 
circuit  could  change  the  contents  of  data  cells  as  they  passed 
by — for  example,  to  alter  a  connection  request  so  it  would 
connect to a different webserver, or to ‘tag’ encrypted traffic
and  look  for  corresponding  corrupted  traffic  at  the  network 
edges  [15].  Tor  hampers  these  attacks  by  verifying  data  in¬ 
tegrity  before  it  leaves  the  network. 

Rendezvous  points  and  hidden  services:  Tor  provides  an 
integrated  mechanism  for  responder  anonymity  via  location- 
protected  servers.  Previous  Onion  Routing  designs  included 
long-lived “reply onions” that could be used to build circuits
to  a  hidden  server,  but  these  reply  onions  did  not  provide  for¬ 
ward  security,  and  became  useless  if  any  node  in  the  path 
went  down  or  rotated  its  keys.  In  Tor,  clients  negotiate  ren¬ 
dezvous  points  to  connect  with  hidden  servers;  reply  onions 
are  no  longer  required. 

Unlike  Freedom  [8],  Tor  does  not  require  OS  kernel 
patches  or  network  stack  support.  This  prevents  us  from 
anonymizing  non-TCP  protocols,  but  has  greatly  helped  our 
portability  and  deployability. 

We  have  implemented  all  of  the  above  features,  including 
rendezvous  points.  Our  source  code  is  available  under  a  free 
license, and Tor is not covered by the patent that affected distribution
and use of earlier versions of Onion Routing. We
have  deployed  a  wide-area  alpha  network  to  test  the  design,  to 
get  more  experience  with  usability  and  users,  and  to  provide 
a  research  platform  for  experimentation.  As  of  this  writing, 
the  network  stands  at  32  nodes  spread  over  two  continents. 

We  review  previous  work  in  Section  2,  describe  our  goals 
and  assumptions  in  Section  3,  and  then  address  the  above  list 
of  improvements  in  Sections  4,  5,  and  6.  We  summarize  in 
Section  7  how  our  design  stands  up  to  known  attacks,  and 
talk  about  our  early  deployment  experiences  in  Section  8.  We 
conclude  with  a  list  of  open  problems  in  Section  9  and  future 
work  for  the  Onion  Routing  project  in  Section  10. 

2  Related  work 

Modern  anonymity  systems  date  to  Chaum’s  Mix-Net  de¬ 
sign  [10].  Chaum  proposed  hiding  the  correspondence  be¬ 
tween  sender  and  recipient  by  wrapping  messages  in  layers 
of  public-key  cryptography,  and  relaying  them  through  a  path 
composed  of  “mixes.”  Each  mix  in  turn  decrypts,  delays,  and 
re-orders  messages  before  relaying  them  onward. 

Subsequent  relay-based  anonymity  designs  have  diverged 
in two main directions. Systems like Babel [28], Mixmaster [36],
and Mixminion [15] have tried to maximize
anonymity  at  the  cost  of  introducing  comparatively  large 
and  variable  latencies.  Because  of  this  decision,  these  high- 
latency  networks  resist  strong  global  adversaries,  but  intro¬ 
duce  too  much  lag  for  interactive  tasks  like  web  browsing, 
Internet  chat,  or  SSH  connections. 

Tor  belongs  to  the  second  category:  low-latency  designs 
that  try  to  anonymize  interactive  network  traffic.  These  sys¬ 
tems  handle  a  variety  of  bidirectional  protocols.  They  also 
provide  more  convenient  mail  delivery  than  the  high-latency 
anonymous  email  networks,  because  the  remote  mail  server 
provides  explicit  and  timely  delivery  confirmation.  But  be¬ 
cause  these  designs  typically  involve  many  packets  that  must 
be  delivered  quickly,  it  is  difficult  for  them  to  prevent  an  at¬ 
tacker  who  can  eavesdrop  both  ends  of  the  communication 
from  correlating  the  timing  and  volume  of  traffic  entering  the 
anonymity network with traffic leaving it [45]. These protocols
are similarly vulnerable to an active adversary who introduces
timing patterns into traffic entering the network and
looks  for  correlated  patterns  among  exiting  traffic.  Although 
some  work  has  been  done  to  frustrate  these  attacks,  most  de¬ 
signs  protect  primarily  against  traffic  analysis  rather  than  traf¬ 
fic  confirmation  (see  Section  3.1). 

The  simplest  low-latency  designs  are  single-hop  proxies 
such  as  the  Anonymizer  [3]:  a  single  trusted  server  strips 
the  data’s  origin  before  relaying  it.  These  designs  are  easy  to 
analyze,  but  users  must  trust  the  anonymizing  proxy.  Concen¬ 
trating  the  traffic  to  this  single  point  increases  the  anonymity 
set  (the  people  a  given  user  is  hiding  among),  but  it  is  vul¬ 
nerable  if  the  adversary  can  observe  all  traffic  entering  and 
leaving  the  proxy. 


More  complex  are  distributed-trust,  circuit-based 
anonymizing  systems.  In  these  designs,  a  user  estab¬ 
lishes  one  or  more  medium-term  bidirectional  end-to-end 
circuits,  and  tunnels  data  in  fixed-size  cells.  Establishing 
circuits  is  computationally  expensive  and  typically  requires 
public-key  cryptography,  whereas  relaying  cells  is  compar¬ 
atively  inexpensive  and  typically  requires  only  symmetric 
encryption.  Because  a  circuit  crosses  several  servers,  and 
each  server  only  knows  the  adjacent  servers  in  the  circuit,  no 
single  server  can  link  a  user  to  her  communication  partners. 

The  Java  Anon  Proxy  (also  known  as  JAP  or  Web  MIXes) 
uses  fixed  shared  routes  known  as  cascades.  As  with  a 
single-hop  proxy,  this  approach  aggregates  users  into  larger 
anonymity  sets,  but  again  an  attacker  only  needs  to  observe 
both  ends  of  the  cascade  to  bridge  all  the  system’s  traffic.  The 
Java  Anon  Proxy’s  design  calls  for  padding  between  end  users 
and  the  head  of  the  cascade  [7].  However,  it  is  not  demon¬ 
strated  whether  the  current  implementation’s  padding  policy 
improves  anonymity. 

PipeNet  [5,  12],  another  low-latency  design  proposed 
around  the  same  time  as  Onion  Routing,  gave  stronger 
anonymity  but  allowed  a  single  user  to  shut  down  the  net¬ 
work  by  not  sending.  Systems  like  ISDN  mixes  [38]  were 
designed  for  other  environments  with  different  assumptions. 

In  P2P  designs  like  Tarzan  [24]  and  MorphMix  [43],  all 
participants  both  generate  traffic  and  relay  traffic  for  others. 
These  systems  aim  to  conceal  whether  a  given  peer  originated 
a  request  or  just  relayed  it  from  another  peer.  While  Tarzan 
and MorphMix use layered encryption as above, Crowds [42]
simply  assumes  an  adversary  who  cannot  observe  the  initia¬ 
tor:  it  uses  no  public-key  encryption,  so  any  node  on  a  circuit 
can  read  users’  traffic. 

Hordes  [34]  is  based  on  Crowds  but  also  uses  multicast 
responses  to  hide  the  initiator.  Herbivore  [25]  and  P5  [46] 
go  even  further,  requiring  broadcast.  These  systems  are  de¬ 
signed  primarily  for  communication  among  peers,  although 
Herbivore  users  can  make  external  connections  by  requesting 
a  peer  to  serve  as  a  proxy. 

Systems  like  Freedom  and  the  original  Onion  Routing 
build  circuits  all  at  once,  using  a  layered  “onion”  of  public- 
key  encrypted  messages,  each  layer  of  which  provides  ses¬ 
sion  keys  and  the  address  of  the  next  server  in  the  circuit. 
Tor  as  described  herein,  Tarzan,  MorphMix,  Cebolla  [9], 
and  Rennhard’s  Anonymity  Network  [44]  build  circuits  in 
stages,  extending  them  one  hop  at  a  time.  Section  4.2  de¬ 
scribes  how  this  approach  enables  perfect  forward  secrecy. 

Circuit-based  designs  must  choose  which  protocol  layer  to 
anonymize. They may intercept IP packets directly, and relay
them whole (stripping the source address) along the circuit
[8, 24]. Like Tor, they may accept TCP streams and
relay  the  data  in  those  streams,  ignoring  the  breakdown  of 
that  data  into  TCP  segments  [43,  44].  Finally,  like  Crowds, 
they  may  accept  application-level  protocols  such  as  HTTP 
and relay the application requests themselves. Making this
protocol-layer decision requires a compromise between flexibility
and anonymity. For example, a system that understands
HTTP  can  strip  identifying  information  from  requests,  can 
take  advantage  of  caching  to  limit  the  number  of  requests  that 
leave  the  network,  and  can  batch  or  encode  requests  to  min¬ 
imize  the  number  of  connections.  On  the  other  hand,  an  IP- 
level  anonymizer  can  handle  nearly  any  protocol,  even  ones 
unforeseen  by  its  designers  (though  these  systems  require 
kernel-level  modifications  to  some  operating  systems,  and  so 
are  more  complex  and  less  portable).  TCP-level  anonymity 
networks  like  Tor  present  a  middle  approach:  they  are  ap¬ 
plication  neutral  (so  long  as  the  application  supports,  or  can 
be  tunneled  across,  TCP),  but  by  treating  application  connec¬ 
tions  as  data  streams  rather  than  raw  TCP  packets,  they  avoid 
the  inefficiencies  of  tunneling  TCP  over  TCP. 

Distributed-trust  anonymizing  systems  need  to  prevent  at¬ 
tackers  from  adding  too  many  servers  and  thus  compromising 
user  paths.  Tor  relies  on  a  small  set  of  well-known  directory 
servers,  run  by  independent  parties,  to  decide  which  nodes 
can  join.  Tarzan  and  MorphMix  allow  unknown  users  to  run 
servers,  and  use  a  limited  resource  (like  IP  addresses)  to  pre¬ 
vent  an  attacker  from  controlling  too  much  of  the  network. 
Crowds  suggests  requiring  written,  notarized  requests  from 
potential  crowd  members. 

Anonymous  communication  is  essential  for  censorship- 
resistant systems like Eternity [2], Free Haven [19], Publius [53],
and Tangler [52]. Tor’s rendezvous points enable
connections  between  mutually  anonymous  entities;  they  are  a 
building  block  for  location-hidden  servers,  which  are  needed 
by  Eternity  and  Free  Haven. 

3  Design  goals  and  assumptions 
Goals 

Like other low-latency anonymity designs, Tor seeks to frustrate
attackers from linking communication partners, or from
linking  multiple  communications  to  or  from  a  single  user. 
Within  this  main  goal,  however,  several  considerations  have 
directed  Tor’s  evolution. 

Deployability:  The  design  must  be  deployed  and  used  in 
the  real  world.  Thus  it  must  not  be  expensive  to  run  (for 
example,  by  requiring  more  bandwidth  than  volunteers  are 
willing  to  provide);  must  not  place  a  heavy  liability  burden 
on  operators  (for  example,  by  allowing  attackers  to  implicate 
onion  routers  in  illegal  activities);  and  must  not  be  difficult 
or  expensive  to  implement  (for  example,  by  requiring  kernel 
patches,  or  separate  proxies  for  every  protocol).  We  also  can¬ 
not  require  non-anonymous  parties  (such  as  websites)  to  run 
our  software.  (Our  rendezvous  point  design  does  not  meet 
this  goal  for  non-anonymous  users  talking  to  hidden  servers, 
however;  see  Section  5.) 

Usability:  A  hard-to-use  system  has  fewer  users — and  be¬ 
cause  anonymity  systems  hide  users  among  users,  a  system 
with fewer users provides less anonymity. Usability is thus
not only a convenience: it is a security requirement [1, 5].
Tor should therefore not require modifying familiar applications;
should not introduce prohibitive delays; and should require
as few configuration decisions as possible. Finally, Tor
should  be  easily  implementable  on  all  common  platforms;  we 
cannot  require  users  to  change  their  operating  system  to  be 
anonymous.  (Tor  currently  runs  on  Win32,  Linux,  Solaris, 
BSD-style  Unix,  MacOS  X,  and  probably  others.) 

Flexibility:  The  protocol  must  be  flexible  and  well- 
specified,  so  Tor  can  serve  as  a  test-bed  for  future  research. 
Many  of  the  open  problems  in  low-latency  anonymity  net¬ 
works,  such  as  generating  dummy  traffic  or  preventing  Sybil 
attacks  [22],  may  be  solvable  independently  from  the  issues 
solved  by  Tor.  Hopefully  future  systems  will  not  need  to  rein¬ 
vent  Tor’s  design. 

Simple  design:  The  protocol’s  design  and  security  param¬ 
eters  must  be  well-understood.  Additional  features  impose 
implementation  and  complexity  costs;  adding  unproven 
techniques  to  the  design  threatens  deployability,  readability, 
and  ease  of  security  analysis.  Tor  aims  to  deploy  a  simple  and 
stable  system  that  integrates  the  best  accepted  approaches  to 
protecting  anonymity. 

Non-goals 

In  favoring  simple,  deployable  designs,  we  have  explicitly  de¬ 
ferred  several  possible  goals,  either  because  they  are  solved 
elsewhere,  or  because  they  are  not  yet  solved. 

Not  peer-to-peer:  Tarzan  and  MorphMix  aim  to  scale 
to  completely  decentralized  peer-to-peer  environments  with 
thousands  of  short-lived  servers,  many  of  which  may  be  con¬ 
trolled  by  an  adversary.  This  approach  is  appealing,  but  still 
has  many  open  problems  [24,  43]. 

Not  secure  against  end-to-end  attacks:  Tor  does  not 
claim  to  completely  solve  end-to-end  timing  or  intersection 
attacks.  Some  approaches,  such  as  having  users  run  their  own 
onion  routers,  may  help;  see  Section  9  for  more  discussion. 

No  protocol  normalization:  Tor  does  not  provide  proto¬ 
col  normalization  like  Privoxy  or  the  Anonymizer.  If  senders 
want  anonymity  from  responders  while  using  complex  and 
variable  protocols  like  HTTP,  Tor  must  be  layered  with  a 
filtering  proxy  such  as  Privoxy  to  hide  differences  between 
clients,  and  expunge  protocol  features  that  leak  identity.  Note 
that  by  this  separation  Tor  can  also  provide  services  that  are 
anonymous  to  the  network  yet  authenticated  to  the  responder, 
like  SSH.  Similarly,  Tor  does  not  integrate  tunneling  for  non- 
stream-based  protocols  like  UDP;  this  must  be  provided  by 
an  external  service  if  appropriate. 

Not  steganographic:  Tor  does  not  try  to  conceal  who  is 
connected  to  the  network. 

3.1  Threat  Model 

A  global  passive  adversary  is  the  most  commonly  assumed 
threat when analyzing theoretical anonymity designs. But
like all practical low-latency systems, Tor does not protect
against such a strong adversary. Instead, we assume an adversary
who can observe some fraction of network traffic; who
sary  who  can  observe  some  fraction  of  network  traffic;  who 
can  generate,  modify,  delete,  or  delay  traffic;  who  can  oper¬ 
ate  onion  routers  of  his  own;  and  who  can  compromise  some 
fraction  of  the  onion  routers. 

In  low-latency  anonymity  systems  that  use  layered  encryp¬ 
tion,  the  adversary’s  typical  goal  is  to  observe  both  the  ini¬ 
tiator  and  the  responder.  By  observing  both  ends,  passive  at¬ 
tackers  can  confirm  a  suspicion  that  Alice  is  talking  to  Bob  if 
the  timing  and  volume  patterns  of  the  traffic  on  the  connec¬ 
tion  are  distinct  enough;  active  attackers  can  induce  timing 
signatures  on  the  traffic  to  force  distinct  patterns.  Rather  than 
focusing  on  these  traffic  confirmation  attacks,  we  aim  to  pre¬ 
vent  traffic  analysis  attacks,  where  the  adversary  uses  traffic 
patterns  to  learn  which  points  in  the  network  he  should  attack. 

Our  adversary  might  try  to  link  an  initiator  Alice  with  her 
communication  partners,  or  try  to  build  a  profile  of  Alice’s 
behavior.  He  might  mount  passive  attacks  by  observing  the 
network  edges  and  correlating  traffic  entering  and  leaving  the 
network — by  relationships  in  packet  timing,  volume,  or  ex¬ 
ternally  visible  user-selected  options.  The  adversary  can  also 
mount  active  attacks  by  compromising  routers  or  keys;  by  re¬ 
playing  traffic;  by  selectively  denying  service  to  trustworthy 
routers  to  move  users  to  compromised  routers,  or  denying  ser¬ 
vice  to  users  to  see  if  traffic  elsewhere  in  the  network  stops;  or 
by  introducing  patterns  into  traffic  that  can  later  be  detected. 
The  adversary  might  subvert  the  directory  servers  to  give 
users  differing  views  of  network  state.  Additionally,  he  can 
try  to  decrease  the  network’s  reliability  by  attacking  nodes 
or  by  performing  antisocial  activities  from  reliable  nodes  and 
trying  to  get  them  taken  down — making  the  network  unre¬ 
liable  flushes  users  to  other  less  anonymous  systems,  where 
they  may  be  easier  to  attack.  We  summarize  in  Section  7  how 
well  the  Tor  design  defends  against  each  of  these  attacks. 

4  The  Tor  Design 

The  Tor  network  is  an  overlay  network;  each  onion  router 
(OR)  runs  as  a  normal  user-level  process  without  any  special 
privileges.  Each  onion  router  maintains  a  TLS  [17]  connec¬ 
tion  to  every  other  onion  router.  Each  user  runs  local  software 
called  an  onion  proxy  (OP)  to  fetch  directories,  establish  cir¬ 
cuits  across  the  network,  and  handle  connections  from  user 
applications.  These  onion  proxies  accept  TCP  streams  and 
multiplex  them  across  the  circuits.  The  onion  router  on  the 
other  side  of  the  circuit  connects  to  the  requested  destinations 
and  relays  data. 

Each  onion  router  maintains  a  long-term  identity  key  and 
a  short-term  onion  key.  The  identity  key  is  used  to  sign  TLS 
certificates,  to  sign  the  OR’s  router  descriptor  (a  summary  of 
its  keys,  address,  bandwidth,  exit  policy,  and  so  on),  and  (by 
directory  servers)  to  sign  directories.  The  onion  key  is  used 
to decrypt requests from users to set up a circuit and negotiate
ephemeral keys. The TLS protocol also establishes a short-term
link key when communicating between ORs. Short-term
keys  are  rotated  periodically  and  independently,  to  limit  the 
impact  of  key  compromise. 

Section  4.1  presents  the  fixed-size  cells  that  are  the  unit 
of  communication  in  Tor.  We  describe  in  Section  4.2  how 
circuits  are  built,  extended,  truncated,  and  destroyed.  Sec¬ 
tion  4.3  describes  how  TCP  streams  are  routed  through  the 
network.  We  address  integrity  checking  in  Section  4.4,  and 
resource  limiting  in  Section  4.5.  Finally,  Section  4.6  talks 
about  congestion  control  and  fairness  issues. 

4.1  Cells 

Onion  routers  communicate  with  one  another,  and  with  users’ 
OPs,  via  TLS  connections  with  ephemeral  keys.  Using  TLS 
conceals  the  data  on  the  connection  with  perfect  forward  se¬ 
crecy,  and  prevents  an  attacker  from  modifying  data  on  the 
wire  or  impersonating  an  OR. 

Traffic  passes  along  these  connections  in  fixed-size  cells. 
Each cell is 512 bytes, and consists of a header and a payload.
The header includes a circuit identifier (circID) that
specifies  which  circuit  the  cell  refers  to  (many  circuits  can 
be  multiplexed  over  the  single  TLS  connection),  and  a  com¬ 
mand  to  describe  what  to  do  with  the  cell’s  payload.  (Circuit 
identifiers  are  connection-specific:  each  circuit  has  a  differ¬ 
ent  circID  on  each  OP/OR  or  OR/OR  connection  it  traverses.) 
Based  on  their  command,  cells  are  either  control  cells,  which 
are  always  interpreted  by  the  node  that  receives  them,  or  re¬ 
lay  cells,  which  carry  end-to-end  stream  data.  The  control 
cell  commands  are:  padding  (currently  used  for  keepalive, 
but  also  usable  for  link  padding);  create  or  created  (used  to 
set  up  a  new  circuit);  and  destroy  (to  tear  down  a  circuit). 
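
The fixed-size cell layout described above can be sketched as follows. Only the 512-byte cell size and the existence of a circID and a command come from the text; the field widths (a 2-byte circID and a 1-byte command) and the numeric command codes are assumptions chosen for illustration.

```python
import struct

CELL_LEN = 512                         # from the text: every cell is 512 bytes
HEADER = struct.Struct(">HB")          # assumed widths: 2-byte circID, 1-byte command
PAYLOAD_LEN = CELL_LEN - HEADER.size   # whatever the header does not use

# Illustrative numbering only; the text names the commands, not their codes.
COMMANDS = {"padding": 0, "create": 1, "created": 2, "relay": 3, "destroy": 4}
NAMES = {code: name for name, code in COMMANDS.items()}


def pack_cell(circ_id, command, payload=b""):
    """Build one fixed-size cell, zero-padding the payload."""
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload does not fit in one cell")
    return HEADER.pack(circ_id, COMMANDS[command]) + payload.ljust(PAYLOAD_LEN, b"\x00")


def unpack_cell(cell):
    """Split a cell into (circID, command name, padded payload)."""
    if len(cell) != CELL_LEN:
        raise ValueError("cells are always exactly CELL_LEN bytes")
    circ_id, code = HEADER.unpack_from(cell)
    return circ_id, NAMES[code], cell[HEADER.size:]
```

Because every cell has the same length regardless of command, an observer of the TLS link cannot distinguish control cells from relay cells by size.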

Relay  cells  have  an  additional  header  (the  relay  header)  at 
the  front  of  the  payload,  containing  a  streamID  (stream  iden¬ 
tifier:  many  streams  can  be  multiplexed  over  a  circuit);  an 
end-to-end  checksum  for  integrity  checking;  the  length  of  the 
relay  payload;  and  a  relay  command.  The  entire  contents  of 
the  relay  header  and  the  relay  cell  payload  are  encrypted  or 
decrypted  together  as  the  relay  cell  moves  along  the  circuit, 
using  the  128-bit  AES  cipher  in  counter  mode  to  generate  a 
cipher  stream.  The  relay  commands  are:  relay  data  (for  data 
flowing  down  the  stream),  relay  begin  (to  open  a  stream),  re¬ 
lay  end  (to  close  a  stream  cleanly),  relay  teardown  (to  close  a 
broken  stream),  relay  connected  (to  notify  the  OP  that  a  relay 
begin  has  succeeded),  relay  extend  and  relay  extended  (to  ex¬ 
tend  the  circuit  by  a  hop,  and  to  acknowledge),  relay  truncate 
and  relay  truncated  (to  tear  down  only  part  of  the  circuit,  and 
to  acknowledge),  relay  sendme  (used  for  congestion  control), 
and  relay  drop  (used  to  implement  long-range  dummies).  We 
give  a  visual  overview  of  cell  structure  plus  the  details  of  re¬ 
lay  cell  structure,  and  then  describe  each  of  these  cell  types 
and  commands  in  more  detail  below. 
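
The layered handling of relay cells can be sketched as below. This is a simplified illustration, not the protocol itself: the Python standard library has no AES, so a SHA-256 counter keystream stands in for AES in counter mode; the keystream restarts for every cell, which a real stream cipher must not do; and the relay header widths (2-byte streamID, 6-byte integrity field, 2-byte length, 1-byte command) and the `digest_seed` parameter are assumptions. The integrity field uses two leading zero bytes followed by four hash bytes, reflecting the zero-byte optimization mentioned later in the text.

```python
import hashlib
import struct

RELAY_HEADER = struct.Struct(">H6sHB")  # assumed: streamID, integrity, length, command
RELAY_PAYLOAD_LEN = 509                 # assumed payload size of a 512-byte cell
BODY_LEN = RELAY_PAYLOAD_LEN - RELAY_HEADER.size


def keystream(key, n):
    """Stand-in for AES-CTR: hash(key || counter) blocks, truncated to n bytes."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])


def xor_layer(key, data):
    """Add or remove one encryption layer (XOR is its own inverse)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def _tag(digest_seed, stream_id, length, command, body):
    blank = RELAY_HEADER.pack(stream_id, b"\x00" * 6, length, command) + body
    return b"\x00\x00" + hashlib.sha1(digest_seed + blank).digest()[:4]


def build_relay_payload(stream_id, command, data, digest_seed):
    body = data.ljust(BODY_LEN, b"\x00")
    tag = _tag(digest_seed, stream_id, len(data), command, body)
    return RELAY_HEADER.pack(stream_id, tag, len(data), command) + body


def check_relay_payload(payload, digest_seed):
    """Return (streamID, command, data) if the payload is meant for this hop."""
    stream_id, tag, length, command = RELAY_HEADER.unpack_from(payload)
    if tag[:2] != b"\x00\x00":
        return None  # cheap rejection: no hash needed in the common case
    body = payload[RELAY_HEADER.size:]
    if tag != _tag(digest_seed, stream_id, length, command, body):
        return None
    return stream_id, command, body[:length]
```

In this sketch the initiator wraps a payload for the second hop with `xor_layer(k1, xor_layer(k2, payload))`. The first hop strips its layer, finds no valid integrity field, and forwards; the second hop strips its layer and recognizes the cell.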


4.2 Circuits and streams

Onion Routing originally built one circuit for each TCP
stream.  Because  building  a  circuit  can  take  several  tenths 
of  a  second  (due  to  public-key  cryptography  and  network  la¬ 
tency),  this  design  imposed  high  costs  on  applications  like 
web  browsing  that  open  many  TCP  streams. 

In  Tor,  each  circuit  can  be  shared  by  many  TCP  streams. 
To  avoid  delays,  users  construct  circuits  preemptively.  To 
limit  linkability  among  their  streams,  users’  OPs  build  a  new 
circuit  periodically  if  the  previous  ones  have  been  used,  and 
expire  old  used  circuits  that  no  longer  have  any  open  streams. 
OPs  consider  rotating  to  a  new  circuit  once  a  minute:  thus 
even  heavy  users  spend  negligible  time  building  circuits,  but 
a  limited  number  of  requests  can  be  linked  to  each  other 
through  a  given  exit  node.  Also,  because  circuits  are  built  in 
the  background,  OPs  can  recover  from  failed  circuit  creation 
without  harming  user  experience. 
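
One way to read the rotation policy above is sketched here. The one-minute interval comes from the text; everything else (the `OnionProxy` and `Circuit` classes, measuring a circuit's age from its first use, and keeping exactly one unused circuit in reserve) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

ROTATE_INTERVAL = 60.0  # seconds; "once a minute" in the text


@dataclass
class Circuit:
    dirty_since: Optional[float] = None  # time of first use, None while unused
    open_streams: int = 0


@dataclass
class OnionProxy:
    circuits: List[Circuit] = field(default_factory=list)

    def tick(self, now):
        """Expire old used circuits with no streams; keep one clean spare."""
        self.circuits = [
            c for c in self.circuits
            if not (c.dirty_since is not None
                    and c.open_streams == 0
                    and now - c.dirty_since >= ROTATE_INTERVAL)
        ]
        if all(c.dirty_since is not None for c in self.circuits):
            self.circuits.append(Circuit())  # built preemptively, in the background

    def attach_stream(self, now):
        """Reuse a recently used circuit if possible, else start a clean one."""
        self.tick(now)
        young = [c for c in self.circuits
                 if c.dirty_since is not None and now - c.dirty_since < ROTATE_INTERVAL]
        circ = young[-1] if young else next(c for c in self.circuits
                                            if c.dirty_since is None)
        if circ.dirty_since is None:
            circ.dirty_since = now
        circ.open_streams += 1
        return circ

    def close_stream(self, circ):
        circ.open_streams -= 1
```

Streams opened within the same minute share a circuit, so only those requests are linkable through a given exit node; a stream opened later lands on a fresh circuit that was already waiting.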

[Figure 1 diagram: Alice --(link is TLS-encrypted)-- OR 1 --(link is TLS-encrypted)-- OR 2 --(unencrypted)-- website]

Figure 1: Alice builds a two-hop circuit and begins fetching
a web page.

Constructing  a  circuit 

A  user’s  OP  constructs  circuits  incrementally,  negotiating  a 
symmetric  key  with  each  OR  on  the  circuit,  one  hop  at  a  time. 
To begin creating a new circuit, the OP (call her Alice) sends
a create cell to the first node in her chosen path (call him
Bob). (She chooses a new circID $C_{AB}$ not currently used on
the connection from her to Bob.) The create cell’s payload
contains the first half of the Diffie-Hellman handshake ($g^x$),
encrypted to Bob’s onion key. Bob
responds with a created cell containing $g^y$ along with a hash
of the negotiated key $K = g^{xy}$.

Once the circuit has been established, Alice and Bob can
send one another relay cells encrypted with the negotiated
key. More detail is given in the next section.

To extend the circuit further, Alice sends a relay extend cell
to Bob, specifying the address of the next OR (call her Carol),
and an encrypted $g^{x_2}$ for her. Bob copies the half-handshake
into a create cell, and passes it to Carol to extend the circuit.
(Bob chooses a new circID $C_{BC}$ not currently used on
the connection between him and Carol. Alice never needs to
know this circID; only Bob associates $C_{AB}$ on his connection
with Alice to $C_{BC}$ on his connection with Carol.) When
Carol responds with a created cell, Bob wraps the payload
into a relay extended cell and passes it back to Alice. Now
the circuit is extended to Carol, and Alice and Carol share a
common key $K_2 = g^{x_2 y_2}$.

To  extend  the  circuit  to  a  third  node  or  beyond,  Alice  pro¬ 
ceeds  as  above,  always  telling  the  last  node  in  the  circuit  to 
extend  one  hop  further. 

This  circuit-level  handshake  protocol  achieves  unilateral 
entity  authentication  (Alice  knows  she’s  handshaking  with 
the  OR,  but  the  OR  doesn’t  care  who  is  opening  the  circuit — 
Alice  uses  no  public  key  and  remains  anonymous)  and  unilat¬ 
eral  key  authentication  (Alice  and  the  OR  agree  on  a  key,  and 
Alice  knows  only  the  OR  learns  it).  It  also  achieves  forward 
secrecy  and  key  freshness.  More  formally,  the  protocol  is  as 
follows (where E_{PK_Bob}(·) is encryption with Bob’s public
key, H is a secure hash function, and | is concatenation):

Alice -> Bob : E_{PK_Bob}(g^x)

Bob -> Alice : g^y, H(K | “handshake”)

In the second step, Bob proves that it was he who received
g^x, and who chose y. We use PK encryption in the first step
(rather  than,  say,  using  the  first  two  steps  of  STS,  which  has 
a  signature  in  the  second  step)  because  a  single  cell  is  too 
small  to  hold  both  a  public  key  and  a  signature.  Preliminary 
analysis  with  the  NRL  protocol  analyzer  [35]  shows  this 
protocol  to  be  secure  (including  perfect  forward  secrecy) 
under  the  traditional  Dolev-Yao  model. 
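The two-message exchange above can be sketched in a few lines. This is a minimal illustration, not the deployed implementation: the group parameters (a 127-bit Mersenne prime with generator 3) are toy values chosen for brevity, SHA-1 stands in for H, the function names are hypothetical, and the public-key encryption of g^x in the first message is omitted entirely.

```python
import hashlib
import secrets

# Toy group parameters (assumption): far too small for real use.
P = 2**127 - 1
G = 3

def int_to_bytes(n: int) -> bytes:
    """Big-endian encoding of a non-negative integer."""
    return n.to_bytes((n.bit_length() + 7) // 8 or 1, "big")

def key_hash(k: int) -> bytes:
    """H(K | "handshake"), with SHA-1 standing in for H."""
    return hashlib.sha1(int_to_bytes(k) + b"handshake").digest()

def alice_start():
    """Alice picks x and sends g^x (encryption to Bob's key is omitted here)."""
    x = secrets.randbelow(P - 3) + 2
    return x, pow(G, x, P)

def bob_respond(gx: int):
    """Bob picks y and returns g^y, the key-confirmation hash, and K."""
    y = secrets.randbelow(P - 3) + 2
    k = pow(gx, y, P)                      # K = (g^x)^y
    return pow(G, y, P), key_hash(k), k

def alice_finish(x: int, gy: int, confirmation: bytes) -> int:
    """Alice derives K and checks that Bob's hash matches it."""
    k = pow(gy, x, P)                      # K = (g^y)^x
    if confirmation != key_hash(k):
        raise ValueError("key confirmation failed")
    return k
```

If the hash check passes, both sides hold the same K; a responder who never saw g^x could not have produced a matching hash.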

Relay  cells 

Once  Alice  has  established  the  circuit  (so  she  shares  keys  with 
each  OR  on  the  circuit),  she  can  send  relay  cells.  Upon  re¬ 
ceiving  a  relay  cell,  an  OR  looks  up  the  corresponding  circuit, 
and  decrypts  the  relay  header  and  payload  with  the  session 
key  for  that  circuit.  If  the  cell  is  headed  away  from  Alice  the 
OR  then  checks  whether  the  decrypted  cell  has  a  valid  digest 
(as  an  optimization,  the  first  two  bytes  of  the  integrity  check 
are  zero,  so  in  most  cases  we  can  avoid  computing  the  hash). 
If  valid,  it  accepts  the  relay  cell  and  processes  it  as  described 
below.  Otherwise,  the  OR  looks  up  the  circID  and  OR  for  the 
next  step  in  the  circuit,  replaces  the  circID  as  appropriate,  and 
sends  the  decrypted  relay  cell  to  the  next  OR.  (If  the  OR  at 
the  end  of  the  circuit  receives  an  unrecognized  relay  cell,  an 
error  has  occurred,  and  the  circuit  is  torn  down.) 

1  Actually,  the  negotiated  key  is  used  to  derive  two  symmetric  keys:  one 
for  each  direction. 


OPs  treat  incoming  relay  cells  similarly:  they  iteratively 
unwrap  the  relay  header  and  payload  with  the  session  keys 
shared  with  each  OR  on  the  circuit,  from  the  closest  to  far¬ 
thest.  If  at  any  stage  the  digest  is  valid,  the  cell  must  have 
originated  at  the  OR  whose  encryption  has  just  been  removed. 

To  construct  a  relay  cell  addressed  to  a  given  OR,  Alice  as¬ 
signs  the  digest,  and  then  iteratively  encrypts  the  cell  payload 
(that  is,  the  relay  header  and  payload)  with  the  symmetric  key 
of  each  hop  up  to  that  OR.  Because  the  digest  is  encrypted  to 
a  different  value  at  each  step,  only  at  the  targeted  OR  will 
it  have  a  meaningful  value.2  This  leaky  pipe  circuit  topol¬ 
ogy  allows  Alice’s  streams  to  exit  at  different  ORs  on  a  sin¬ 
gle  circuit.  Alice  may  choose  different  exit  points  because  of 
their  exit  policies,  or  to  keep  the  ORs  from  knowing  that  two 
streams  originate  from  the  same  person. 

When  an  OR  later  replies  to  Alice  with  a  relay  cell,  it  en¬ 
crypts  the  cell’s  relay  header  and  payload  with  the  single  key 
it  shares  with  Alice,  and  sends  the  cell  back  toward  Alice 
along  the  circuit.  Subsequent  ORs  add  further  layers  of  en¬ 
cryption  as  they  relay  the  cell  back  to  Alice. 
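The layering in both directions can be sketched as follows. This is a toy model under loudly stated assumptions: a SHA-256 counter-mode keystream stands in for the real cipher and restarts for every cell, one key per hop is used for both directions and for the digest (the design derives separate keys), cells are not padded to a fixed size, and all function names are hypothetical.

```python
import hashlib

def keystream(key: bytes, n: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (stand-in for the real cipher)."""
    blocks = []
    for counter in range((n + 31) // 32):
        blocks.append(hashlib.sha256(key + counter.to_bytes(4, "big")).digest())
    return b"".join(blocks)[:n]

def xor_layer(key: bytes, data: bytes) -> bytes:
    """Add or remove one layer; XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def seal(key: bytes, body: bytes) -> bytes:
    """Prefix two zero bytes and a 4-byte digest so the target can recognize the cell."""
    return b"\x00\x00" + hashlib.sha1(key + body).digest()[:4] + body

def recognize(key: bytes, plain: bytes):
    """Return the body if the zero bytes and digest check out, else None."""
    body = plain[6:]
    if plain[:2] == b"\x00\x00" and plain[2:6] == hashlib.sha1(key + body).digest()[:4]:
        return body
    return None

def alice_wrap(hop_keys, target: int, body: bytes) -> bytes:
    """Alice sets the digest for hop `target`, then encrypts for each hop up to it."""
    cell = seal(hop_keys[target], body)
    for key in reversed(hop_keys[: target + 1]):
        cell = xor_layer(key, cell)
    return cell

def or_forward(key: bytes, cell: bytes):
    """One OR strips its layer, then either accepts the cell or relays it onward."""
    plain = xor_layer(key, cell)
    body = recognize(key, plain)
    return ("accept", body) if body is not None else ("forward", plain)

def reply_path(hop_keys, origin: int, body: bytes) -> bytes:
    """Hop `origin` encrypts a reply; each OR nearer to Alice adds its own layer."""
    cell = seal(hop_keys[origin], body)
    for key in reversed(hop_keys[: origin + 1]):
        cell = xor_layer(key, cell)
    return cell

def alice_unwrap(hop_keys, cell: bytes):
    """Alice removes layers from the closest hop outward until the digest is valid."""
    for hop, key in enumerate(hop_keys):
        cell = xor_layer(key, cell)
        body = recognize(key, cell)
        if body is not None:
            return hop, body
    raise ValueError("unrecognized relay cell")
```

Because only the targeted hop sees a valid digest, calling `alice_wrap` with different `target` values on the same key list models streams exiting at different ORs of one circuit.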

To  tear  down  a  circuit,  Alice  sends  a  destroy  control  cell. 
Each  OR  in  the  circuit  receives  the  destroy  cell,  closes  all 
streams  on  that  circuit,  and  passes  a  new  destroy  cell  forward. 
But  just  as  circuits  are  built  incrementally,  they  can  also  be 
torn  down  incrementally:  Alice  can  send  a  relay  truncate  cell 
to  a  single  OR  on  a  circuit.  That  OR  then  sends  a  destroy  cell 
forward,  and  acknowledges  with  a  relay  truncated  cell.  Alice 
can  then  extend  the  circuit  to  different  nodes,  without  signal¬ 
ing  to  the  intermediate  nodes  (or  a  limited  observer)  that  she 
has  changed  her  circuit.  Similarly,  if  a  node  on  the  circuit 
goes  down,  the  adjacent  node  can  send  a  relay  truncated  cell 
back  to  Alice.  Thus  the  “break  a  node  and  see  which  circuits 
go  down”  attack  [4]  is  weakened. 

4.3  Opening  and  closing  streams 

When  Alice’s  application  wants  a  TCP  connection  to  a  given 
address  and  port,  it  asks  the  OP  (via  SOCKS)  to  make  the 
connection.  The  OP  chooses  the  newest  open  circuit  (or  cre¬ 
ates  one  if  needed),  and  chooses  a  suitable  OR  on  that  circuit 
to  be  the  exit  node  (usually  the  last  node,  but  maybe  others 
due  to  exit  policy  conflicts;  see  Section  6.2.)  The  OP  then 
opens  the  stream  by  sending  a  relay  begin  cell  to  the  exit  node, 
using  a  new  random  streamID.  Once  the  exit  node  connects 
to  the  remote  host,  it  responds  with  a  relay  connected  cell. 
Upon  receipt,  the  OP  sends  a  SOCKS  reply  to  notify  the  ap¬ 
plication  of  its  success.  The  OP  now  accepts  data  from  the 
application’s  TCP  stream,  packaging  it  into  relay  data  cells 
and  sending  those  cells  along  the  circuit  to  the  chosen  OR. 

2 With 48 bits of digest per cell, the probability of an accidental collision
is far lower than the chance of hardware failure.

There’s a catch to using SOCKS, however: some applications
pass the alphanumeric hostname to the Tor client, while
others resolve it into an IP address first and then pass the IP
address to the Tor client. If the application does DNS resolution
first, Alice thereby reveals her destination to the remote
DNS server, rather than sending the hostname through the Tor
network to be resolved at the far end. Common applications
like Mozilla and SSH have this flaw.

With  Mozilla,  the  flaw  is  easy  to  address:  the  filtering 
HTTP  proxy  called  Privoxy  gives  a  hostname  to  the  Tor 
client,  so  Alice’s  computer  never  does  DNS  resolution.  But 
a  portable  general  solution,  such  as  is  needed  for  SSH,  is  an 
open  problem.  Modifying  or  replacing  the  local  nameserver 
can be invasive, brittle, and unportable. Forcing the resolver
library  to  prefer  TCP  rather  than  UDP  is  hard,  and  also  has 
portability  problems.  Dynamically  intercepting  system  calls 
to  the  resolver  library  seems  a  promising  direction.  We  could 
also  provide  a  tool  similar  to  dig  to  perform  a  private  lookup 
through  the  Tor  network.  Currently,  we  encourage  the  use  of 
privacy-aware  proxies  like  Privoxy  wherever  possible. 

Closing  a  Tor  stream  is  analogous  to  closing  a  TCP  stream: 
it  uses  a  two-step  handshake  for  normal  operation,  or  a  one- 
step  handshake  for  errors.  If  the  stream  closes  abnormally, 
the  adjacent  node  simply  sends  a  relay  teardown  cell.  If  the 
stream  closes  normally,  the  node  sends  a  relay  end  cell  down 
the  circuit,  and  the  other  side  responds  with  its  own  relay  end 
cell.  Because  all  relay  cells  use  layered  encryption,  only  the 
destination  OR  knows  that  a  given  relay  cell  is  a  request  to 
close  a  stream.  This  two-step  handshake  allows  Tor  to  support 
TCP-based  applications  that  use  half-closed  connections. 

4.4  Integrity  checking  on  streams 

Because  the  old  Onion  Routing  design  used  a  stream  cipher 
without  integrity  checking,  traffic  was  vulnerable  to  a  mal¬ 
leability  attack:  though  the  attacker  could  not  decrypt  cells, 
any  changes  to  encrypted  data  would  create  corresponding 
changes  to  the  data  leaving  the  network.  This  weakness  al¬ 
lowed  an  adversary  who  could  guess  the  encrypted  content  to 
change  a  padding  cell  to  a  destroy  cell;  change  the  destination 
address in a relay begin cell to the adversary’s webserver; or
change  an  FTP  command  from  dir  to  rm  *.  (Even  an  ex¬ 
ternal  adversary  could  do  this,  because  the  link  encryption 
similarly  used  a  stream  cipher.) 

Because  Tor  uses  TLS  on  its  links,  external  adversaries 
cannot  modify  data.  Addressing  the  insider  malleability  at¬ 
tack,  however,  is  more  complex. 

We  could  do  integrity  checking  of  the  relay  cells  at  each 
hop,  either  by  including  hashes  or  by  using  an  authenticating 
cipher  mode  like  EAX  [6],  but  there  are  some  problems.  First, 
these  approaches  impose  a  message-expansion  overhead  at 
each  hop,  and  so  we  would  have  to  either  leak  the  path  length 
or  waste  bytes  by  padding  to  a  maximum  path  length.  Sec¬ 
ond,  these  solutions  can  only  verify  traffic  coming  from  Al¬ 
ice:  ORs  would  not  be  able  to  produce  suitable  hashes  for 
the  intermediate  hops,  since  the  ORs  on  a  circuit  do  not  know 
the  other  ORs’  session  keys.  Third,  we  have  already  accepted 


that  our  design  is  vulnerable  to  end-to-end  timing  attacks;  so 
tagging  attacks  performed  within  the  circuit  provide  no  addi¬ 
tional  information  to  the  attacker. 

Thus,  we  check  integrity  only  at  the  edges  of  each  stream. 
(Remember  that  in  our  leaky-pipe  circuit  topology,  a  stream’s 
edge  could  be  any  hop  in  the  circuit.)  When  Alice  negotiates 
a  key  with  a  new  hop,  they  each  initialize  a  SHA-1  digest  with 
a  derivative  of  that  key,  thus  beginning  with  randomness  that 
only  the  two  of  them  know.  Then  they  each  incrementally 
add  to  the  SHA-1  digest  the  contents  of  all  relay  cells  they 
create,  and  include  with  each  relay  cell  the  first  four  bytes  of 
the  current  digest.  Each  also  keeps  a  SHA-1  digest  of  data 
received,  to  verify  that  the  received  hashes  are  correct. 

To  be  sure  of  removing  or  modifying  a  cell,  the  attacker 
must  be  able  to  deduce  the  current  digest  state  (which  de¬ 
pends  on  all  traffic  between  Alice  and  Bob,  starting  with  their 
negotiated  key).  Attacks  on  SHA-1  where  the  adversary  can 
incrementally  add  to  a  hash  to  produce  a  new  valid  hash  don’t 
work,  because  all  hashes  are  end-to-end  encrypted  across  the 
circuit.  The  computational  overhead  of  computing  the  digests 
is  minimal  compared  to  doing  the  AES  encryption  performed 
at  each  hop  of  the  circuit.  We  use  only  four  bytes  per  cell 
to  minimize  overhead;  the  chance  that  an  adversary  will  cor¬ 
rectly  guess  a  valid  hash  is  acceptably  low,  given  that  the  OP 
or  OR  tear  down  the  circuit  if  they  receive  a  bad  hash. 
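A minimal sketch of the running digest described above, assuming a hypothetical seed derivation (SHA-1 over a fixed label plus the negotiated key) and a hypothetical class name. Each endpoint would keep one such object per direction.

```python
import hashlib

class RunningDigest:
    """One direction of the end-to-end integrity check between Alice and a hop."""

    def __init__(self, negotiated_key: bytes):
        # Seed with a derivative of the key; this exact derivation is an assumption.
        seed = hashlib.sha1(b"integrity-seed" + negotiated_key).digest()
        self._sha1 = hashlib.sha1(seed)

    def tag(self, cell_contents: bytes) -> bytes:
        """Fold a cell into the running digest and return its first four bytes."""
        self._sha1.update(cell_contents)
        return self._sha1.digest()[:4]

    def verify(self, cell_contents: bytes, received_tag: bytes) -> bool:
        """Receiver side: recompute the tag over received data and compare."""
        return self.tag(cell_contents) == received_tag
```

Since the tag depends on every earlier cell, a dropped or altered cell desynchronizes the two digests, and a blind guess at a four-byte tag succeeds with probability 2^-32.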


4.5  Rate  limiting  and  fairness 

Volunteers  are  more  willing  to  run  services  that  can  limit 
their bandwidth usage. To accommodate them, Tor servers
use  a  token  bucket  approach  [50]  to  enforce  a  long-term  aver¬ 
age  rate  of  incoming  bytes,  while  still  permitting  short-term 
bursts  above  the  allowed  bandwidth. 

Because  the  Tor  protocol  outputs  about  the  same  number 
of  bytes  as  it  takes  in,  it  is  sufficient  in  practice  to  limit  only 
incoming  bytes.  With  TCP  streams,  however,  the  correspon¬ 
dence  is  not  one-to-one:  relaying  a  single  incoming  byte  can 
require  an  entire  512-byte  cell.  (We  can’t  just  wait  for  more 
bytes,  because  the  local  application  may  be  awaiting  a  reply.) 
Therefore,  we  treat  this  case  as  if  the  entire  cell  size  had  been 
read,  regardless  of  the  cell’s  fullness. 
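The token bucket and the full-cell accounting rule can be sketched as below. The class and function names are hypothetical, and time is passed in explicitly so the example stays deterministic; a real server would read a clock.

```python
CELL_SIZE = 512  # bytes per cell, as stated in the text

class TokenBucket:
    """Long-term average of `rate` bytes/second with bursts up to `burst` bytes."""

    def __init__(self, rate: float, burst: float, now: float = 0.0):
        self.rate = rate
        self.burst = burst
        self.tokens = burst
        self.last = now

    def _refill(self, now: float) -> None:
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now

    def consume(self, nbytes: int, now: float) -> bool:
        """Spend tokens if enough are available; otherwise refuse the read."""
        self._refill(now)
        if self.tokens < nbytes:
            return False
        self.tokens -= nbytes
        return True

def charge_stream_read(bucket: TokenBucket, payload_len: int, now: float) -> bool:
    """Charge a whole cell for a TCP read, however few bytes it carried."""
    if payload_len <= 0:
        return True
    return bucket.consume(CELL_SIZE, now)
```

With `rate=1024` and `burst=2048`, four one-byte reads exhaust the bucket immediately, and the next read succeeds only after half a second of refill.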

Further, inspired by Rennhard et al.’s design in [44], a circuit’s
edges can heuristically distinguish interactive streams
from  bulk  streams  by  comparing  the  frequency  with  which 
they  supply  cells.  We  can  provide  good  latency  for  interactive 
streams  by  giving  them  preferential  service,  while  still  giving 
good  overall  throughput  to  the  bulk  streams.  Such  prefer¬ 
ential  treatment  presents  a  possible  end-to-end  attack,  but  an 
adversary  observing  both  ends  of  the  stream  can  already  learn 
this  information  through  timing  attacks. 


4.6  Congestion  control 


Even  with  bandwidth  rate  limiting,  we  still  need  to  worry 
about  congestion,  either  accidental  or  intentional.  If  enough 
users  choose  the  same  OR-to-OR  connection  for  their  cir¬ 
cuits,  that  connection  can  become  saturated.  For  example, 
an  attacker  could  send  a  large  file  through  the  Tor  network 
to a webserver he runs, and then refuse to read any of the
bytes at the webserver end of the circuit. Without some congestion
control mechanism, these bottlenecks can propagate
back  through  the  entire  network.  We  don’t  need  to  reimple¬ 
ment  full  TCP  windows  (with  sequence  numbers,  the  abil¬ 
ity  to  drop  cells  when  we’re  full  and  retransmit  later,  and 
so  on),  because  TCP  already  guarantees  in-order  delivery  of 
each  cell.  We  describe  our  response  below. 

Circuit-level  throttling:  To  control  a  circuit’s  bandwidth 
usage,  each  OR  keeps  track  of  two  windows.  The  packaging 
window  tracks  how  many  relay  data  cells  the  OR  is  allowed  to 
package  (from  incoming  TCP  streams)  for  transmission  back 
to  the  OP,  and  the  delivery  window  tracks  how  many  relay 
data  cells  it  is  willing  to  deliver  to  TCP  streams  outside  the 
network.  Each  window  is  initialized  (say,  to  1000  data  cells). 
When  a  data  cell  is  packaged  or  delivered,  the  appropriate 
window  is  decremented.  When  an  OR  has  received  enough 
data  cells  (currently  100),  it  sends  a  relay  sendme  cell  towards 
the  OP,  with  streamID  zero.  When  an  OR  receives  a  relay 
sendme  cell  with  streamID  zero,  it  increments  its  packaging 
window.  Either  of  these  cells  increments  the  corresponding 
window  by  100.  If  the  packaging  window  reaches  0,  the  OR 
stops  reading  from  TCP  connections  for  all  streams  on  the 
corresponding  circuit,  and  sends  no  more  relay  data  cells  until 
receiving  a  relay  sendme  cell. 

The  OP  behaves  identically,  except  that  it  must  track  a 
packaging  window  and  a  delivery  window  for  every  OR  in 
the  circuit.  If  a  packaging  window  reaches  0,  it  stops  reading 
from  streams  destined  for  that  OR. 
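The window bookkeeping for one circuit endpoint can be sketched as follows, using the values quoted in the text (initial window 1000, increment 100). The class and method names are hypothetical; an OP would hold one such object per OR in the circuit.

```python
CIRC_WINDOW_START = 1000      # initial window, as in the text
CIRC_WINDOW_INCREMENT = 100   # cells acknowledged per relay sendme

class CircuitWindows:
    def __init__(self):
        self.packaging = CIRC_WINDOW_START
        self.delivery = CIRC_WINDOW_START
        self._received = 0    # data cells received since the last sendme

    def package_cell(self) -> bool:
        """Try to package one data cell; False means stop reading from TCP."""
        if self.packaging == 0:
            return False
        self.packaging -= 1
        return True

    def deliver_cell(self) -> bool:
        """Deliver one data cell; True means a relay sendme should be sent."""
        self.delivery -= 1
        self._received += 1
        if self._received >= CIRC_WINDOW_INCREMENT:
            self._received = 0
            self.delivery += CIRC_WINDOW_INCREMENT
            return True
        return False

    def on_sendme(self) -> None:
        """A circuit-level sendme (streamID zero) reopens the packaging window."""
        self.packaging += CIRC_WINDOW_INCREMENT
```

A sender that never hears a sendme stalls after 1000 cells, which bounds how much data a non-reading peer can force the network to buffer.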

Stream-level  throttling:  The  stream-level  congestion  con¬ 
trol  mechanism  is  similar  to  the  circuit-level  mechanism.  ORs 
and  OPs  use  relay  sendme  cells  to  implement  end-to-end  flow 
control  for  individual  streams  across  circuits.  Each  stream 
begins  with  a  packaging  window  (currently  500  cells),  and 
increments  the  window  by  a  fixed  value  (50)  upon  receiv¬ 
ing  a  relay  sendme  cell.  Rather  than  always  returning  a  relay 
sendme  cell  as  soon  as  enough  cells  have  arrived,  the  stream- 
level  congestion  control  also  has  to  check  whether  data  has 
been  successfully  flushed  onto  the  TCP  stream;  it  sends  the 
relay  sendme  cell  only  when  the  number  of  bytes  pending  to 
be  flushed  is  under  some  threshold  (currently  10  cells’  worth). 
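The receiving edge of a stream can be sketched as below. The increment (50) and the flush threshold (10 cells' worth) come from the text; counting a cell's worth as 512 bytes is an assumption, as are the class and method names.

```python
STREAM_WINDOW_INCREMENT = 50          # cells acknowledged per stream-level sendme
BYTES_PER_CELL = 512                  # assumption: what "a cell's worth" means
FLUSH_THRESHOLD = 10 * BYTES_PER_CELL

class StreamReceiver:
    def __init__(self):
        self.unacked = 0   # data cells received and not yet acknowledged
        self.pending = 0   # bytes queued for the TCP stream but not yet flushed

    def on_data_cell(self, nbytes: int) -> bool:
        """Record an arriving data cell; True means send a relay sendme now."""
        self.unacked += 1
        self.pending += nbytes
        return self._maybe_sendme()

    def on_flushed(self, nbytes: int) -> bool:
        """Record bytes flushed to TCP; True means send a relay sendme now."""
        self.pending -= nbytes
        return self._maybe_sendme()

    def _maybe_sendme(self) -> bool:
        # Acknowledge only when enough cells arrived AND the TCP side is draining.
        if self.unacked >= STREAM_WINDOW_INCREMENT and self.pending < FLUSH_THRESHOLD:
            self.unacked -= STREAM_WINDOW_INCREMENT
            return True
        return False
```

The second condition is what ties the sender's window to the real TCP connection: a destination that stops reading never lets `pending` fall, so no further sendme cells are issued.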

These  arbitrarily  chosen  parameters  seem  to  give  tolerable 
throughput  and  delay;  see  Section  8. 


5  Rendezvous  Points  and  hidden  services 

Rendezvous points are a building block for location-hidden
services (also known as responder anonymity) in the Tor network.
Location-hidden services allow Bob to offer a TCP service,
such as a webserver, without revealing his IP address.
This  type  of  anonymity  protects  against  distributed  DoS  at¬ 
tacks:  attackers  are  forced  to  attack  the  onion  routing  network 
because  they  do  not  know  Bob’s  IP  address. 

Our  design  for  location-hidden  servers  has  the  following 
goals.  Access-control:  Bob  needs  a  way  to  filter  incoming 
requests,  so  an  attacker  cannot  flood  Bob  simply  by  mak¬ 
ing  many  connections  to  him.  Robustness:  Bob  should  be 
able  to  maintain  a  long-term  pseudonymous  identity  even  in 
the  presence  of  router  failure.  Bob’s  service  must  not  be  tied 
to  a  single  OR,  and  Bob  must  be  able  to  migrate  his  service 
across  ORs.  Smear-resistance:  A  social  attacker  should  not 
be  able  to  “frame”  a  rendezvous  router  by  offering  an  ille¬ 
gal  or  disreputable  location-hidden  service  and  making  ob¬ 
servers  believe  the  router  created  that  service.  Application- 
transparency:  Although  we  require  users  to  run  special  soft¬ 
ware  to  access  location-hidden  servers,  we  must  not  require 
them  to  modify  their  applications. 

We  provide  location-hiding  for  Bob  by  allowing  him  to 
advertise several onion routers (his introduction points) as
contact points. He may do this on any robust efficient key-
value lookup system with authenticated updates, such as a
distributed hash table (DHT) like CFS [11].3 Alice, the client,
chooses  an  OR  as  her  rendezvous  point.  She  connects  to  one 
of  Bob’s  introduction  points,  informs  him  of  her  rendezvous 
point,  and  then  waits  for  him  to  connect  to  the  rendezvous 
point.  This  extra  level  of  indirection  helps  Bob’s  introduc¬ 
tion  points  avoid  problems  associated  with  serving  unpopular 
files  directly  (for  example,  if  Bob  serves  material  that  the  in¬ 
troduction  point’s  community  finds  objectionable,  or  if  Bob’s 
service  tends  to  get  attacked  by  network  vandals).  The  ex¬ 
tra  level  of  indirection  also  allows  Bob  to  respond  to  some 
requests  and  ignore  others. 

5.1  Rendezvous  points  in  Tor 

The  following  steps  are  performed  on  behalf  of  Alice  and  Bob 
by  their  local  OPs;  application  integration  is  described  more 
fully  below. 

•  Bob  generates  a  long-term  public  key  pair  to  identify  his 
service. 

•  Bob chooses some introduction points, and advertises
them on the lookup service, signing the advertisement
with the private half of his key pair. He can add more later.

•  Bob  builds  a  circuit  to  each  of  his  introduction  points, 
and  tells  them  to  wait  for  requests. 

3 Rather than rely on an external infrastructure, the Onion Routing network
can run the lookup service itself. Our current implementation provides
a simple lookup system on the directory servers.


•  Alice  learns  about  Bob’s  service  out  of  band  (perhaps 
Bob  told  her,  or  she  found  it  on  a  website).  She  retrieves 
the  details  of  Bob’s  service  from  the  lookup  service.  If 
Alice  wants  to  access  Bob’s  service  anonymously,  she 
must  connect  to  the  lookup  service  via  Tor. 

•  Alice  chooses  an  OR  as  the  rendezvous  point  (RP)  for 
her  connection  to  Bob’s  service.  She  builds  a  circuit 
to  the  RP,  and  gives  it  a  randomly  chosen  “rendezvous 
cookie”  to  recognize  Bob. 

•  Alice  opens  an  anonymous  stream  to  one  of  Bob’s  intro¬ 
duction  points,  and  gives  it  a  message  (encrypted  with 
Bob’s  public  key)  telling  it  about  herself,  her  RP  and  ren¬ 
dezvous  cookie,  and  the  start  of  a  DH  handshake.  The 
introduction  point  sends  the  message  to  Bob. 

•  If  Bob  wants  to  talk  to  Alice,  he  builds  a  circuit  to  Al¬ 
ice’s  RP  and  sends  the  rendezvous  cookie,  the  second 
half  of  the  DH  handshake,  and  a  hash  of  the  session  key 
they  now  share.  By  the  same  argument  as  in  Section  4.2, 
Alice  knows  she  shares  the  key  only  with  Bob. 

•  The  RP  connects  Alice’s  circuit  to  Bob’s.  Note  that  RP 
can’t  recognize  Alice,  Bob,  or  the  data  they  transmit. 

•  Alice  sends  a  relay  begin  cell  along  the  circuit.  It  arrives 
at Bob’s OP, which connects to Bob’s webserver.

•  An  anonymous  stream  has  been  established,  and  Alice 
and  Bob  communicate  as  normal. 
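The rendezvous point's role in the steps above (hold Alice's cookie, match Bob's, then splice the two circuits) can be sketched as a small state table. Circuit identifiers are plain strings here, the 20-byte cookie length is an assumption, and the class is hypothetical; the handshake and all encryption are left out.

```python
import secrets

class RendezvousPoint:
    def __init__(self):
        self._waiting = {}   # rendezvous cookie -> Alice's circuit
        self._joined = {}    # circuit -> the circuit it is spliced to

    def establish(self, alice_circuit: str, cookie: bytes) -> None:
        """Alice registers a cookie and waits for Bob."""
        self._waiting[cookie] = alice_circuit

    def rendezvous(self, bob_circuit: str, cookie: bytes) -> bool:
        """Bob presents the cookie; on a match the two circuits are connected."""
        alice_circuit = self._waiting.pop(cookie, None)
        if alice_circuit is None:
            return False
        self._joined[alice_circuit] = bob_circuit
        self._joined[bob_circuit] = alice_circuit
        return True

    def relay(self, from_circuit: str, cell: bytes):
        """Pass an opaque cell to the other side without inspecting it."""
        return self._joined[from_circuit], cell

def new_cookie() -> bytes:
    """A randomly chosen rendezvous cookie (length is an assumption)."""
    return secrets.token_bytes(20)
```

The RP only ever handles a cookie and opaque cells, which is consistent with the claim that it cannot recognize Alice, Bob, or their data.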


When establishing an introduction point, Bob provides the
onion  router  with  the  public  key  identifying  his  service.  Bob 
signs  his  messages,  so  others  cannot  usurp  his  introduction 
point  in  the  future.  He  uses  the  same  public  key  to  establish 
the  other  introduction  points  for  his  service,  and  periodically 
refreshes  his  entry  in  the  lookup  service. 

The  message  that  Alice  gives  the  introduction  point  in¬ 
cludes  a  hash  of  Bob’s  public  key  and  an  optional  initial  au¬ 
thorization  token  (the  introduction  point  can  do  prescreening, 
for  example  to  block  replays).  Her  message  to  Bob  may  in¬ 
clude  an  end-to-end  authorization  token  so  Bob  can  choose 
whether  to  respond.  The  authorization  tokens  can  be  used 
to  provide  selective  access:  important  users  can  get  uninter¬ 
rupted  access.  During  normal  situations,  Bob’s  service  might 
simply  be  offered  directly  from  mirrors,  while  Bob  gives 
out  tokens  to  high-priority  users.  If  the  mirrors  are  knocked 
down,  those  users  can  switch  to  accessing  Bob’s  service  via 
the  Tor  rendezvous  system. 

Bob’s  introduction  points  are  themselves  subject  to  DoS — 
he  must  open  many  introduction  points  or  risk  such  an  at¬ 
tack.  He  can  provide  selected  users  with  a  current  list  or  fu¬ 
ture  schedule  of  unadvertised  introduction  points;  this  is  most 
practical  if  there  is  a  stable  and  large  group  of  introduction 
points  available.  Bob  could  also  give  secret  public  keys  for 
consulting  the  lookup  service.  All  of  these  approaches  limit 
exposure  even  when  some  selected  users  collude  in  the  DoS. 


5.2  Integration  with  user  applications 

Bob  configures  his  onion  proxy  to  know  the  local  IP  address 
and  port  of  his  service,  a  strategy  for  authorizing  clients,  and 
his  public  key.  The  onion  proxy  anonymously  publishes  a 
signed  statement  of  Bob’s  public  key,  an  expiration  time,  and 
the  current  introduction  points  for  his  service  onto  the  lookup 
service, indexed by the hash of his public key. Bob’s webserver
is unmodified, and doesn’t even know that it’s hidden
behind  the  Tor  network. 

Alice’s  applications  also  work  unchanged — her  client 
interface  remains  a  SOCKS  proxy.  We  encode  all  of 
the  necessary  information  into  the  fully  qualified  domain 
name  (FQDN)  Alice  uses  when  establishing  her  connection. 
Location-hidden services use a virtual top level domain called
.onion: thus hostnames take the form x.y.onion where
x  is  the  authorization  cookie  and  y  encodes  the  hash  of 
the  public  key.  Alice’s  onion  proxy  examines  addresses;  if 
they’re  destined  for  a  hidden  server,  it  decodes  the  key  and 
starts  the  rendezvous  as  described  above. 
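A sketch of how an onion proxy might classify an incoming SOCKS hostname. The text specifies only the x.y.onion form, so accepting a two-label name with no cookie is an assumption, as is the function name; how y actually encodes the key hash is not modeled.

```python
def parse_onion(hostname: str):
    """Return (cookie, key_label) for a .onion name, or None for ordinary hosts."""
    labels = hostname.rstrip(".").split(".")
    if len(labels) < 2 or labels[-1].lower() != "onion":
        return None                      # not a hidden service address
    if len(labels) == 3:
        cookie, key_label = labels[0], labels[1]
        return cookie, key_label         # x.y.onion, as described in the text
    if len(labels) == 2:
        return None, labels[0]           # assumption: cookie may be absent
    raise ValueError("unexpected .onion hostname form")
```

For example, `parse_onion("cookie.abcdef.onion")` yields `("cookie", "abcdef")`, while an ordinary hostname yields `None` and is handled as a normal stream.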

5.3  Previous  rendezvous  work 

Rendezvous  points  in  low-latency  anonymity  systems  were 
first  described  for  use  in  ISDN  telephony  [30,  38].  Later  low- 
latency  designs  used  rendezvous  points  for  hiding  location 
of mobile phones and low-power location trackers [23, 40].
Rendezvous  for  anonymizing  low-latency  Internet  connec¬ 
tions  was  suggested  in  early  Onion  Routing  work  [27],  but 
the  first  published  design  was  by  Ian  Goldberg  [26].  His  de¬ 
sign  differs  from  ours  in  three  ways.  First,  Goldberg  suggests 
that  Alice  should  manually  hunt  down  a  current  location  of 
the  service  via  Gnutella;  our  approach  makes  lookup  trans¬ 
parent  to  the  user,  as  well  as  faster  and  more  robust.  Second, 
in  Tor  the  client  and  server  negotiate  session  keys  with  Diffie- 
Hellman,  so  plaintext  is  not  exposed  even  at  the  rendezvous 
point.  Third,  our  design  minimizes  the  exposure  from  run¬ 
ning  the  service,  to  encourage  volunteers  to  offer  introduc¬ 
tion  and  rendezvous  services.  Tor’s  introduction  points  do  not 
output  any  bytes  to  the  clients;  the  rendezvous  points  don’t 
know  the  client  or  the  server,  and  can’t  read  the  data  being 
transmitted.  The  indirection  scheme  is  also  designed  to  in¬ 
clude  authentication/authorization — if  Alice  doesn’t  include 
the right cookie with her request for service, Bob need not
even  acknowledge  his  existence. 

6  Other  design  decisions 
6.1  Denial  of  service 

Providing  Tor  as  a  public  service  creates  many  opportuni¬ 
ties  for  denial-of-service  attacks  against  the  network.  While 
flow  control  and  rate  limiting  (discussed  in  Section  4.6)  pre¬ 
vent  users  from  consuming  more  bandwidth  than  routers  are 


willing  to  provide,  opportunities  remain  for  users  to  consume 
more  network  resources  than  their  fair  share,  or  to  render  the 
network  unusable  for  others. 

First  of  all,  there  are  several  CPU-consuming  denial-of- 
service  attacks  wherein  an  attacker  can  force  an  OR  to  per¬ 
form  expensive  cryptographic  operations.  For  example,  an  at¬ 
tacker  can  fake  the  start  of  a  TLS  handshake,  forcing  the  OR 
to  carry  out  its  (comparatively  expensive)  half  of  the  hand¬ 
shake  at  no  real  computational  cost  to  the  attacker. 

We  have  not  yet  implemented  any  defenses  for  these  at¬ 
tacks,  but  several  approaches  are  possible.  First,  ORs  can 
require  clients  to  solve  a  puzzle  [16]  while  beginning  new 
TLS  handshakes  or  accepting  create  cells.  So  long  as  these 
tokens  are  easy  to  verify  and  computationally  expensive  to 
produce,  this  approach  limits  the  attack  multiplier.  Addition¬ 
ally,  ORs  can  limit  the  rate  at  which  they  accept  create  cells 
and TLS connections, so that the computational work of processing
them does not drown out the symmetric cryptography
operations  that  keep  cells  flowing.  This  rate  limiting  could, 
however,  allow  an  attacker  to  slow  down  other  users  when 
they  build  new  circuits. 
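To illustrate the "easy to verify, expensive to produce" property, here is a hash-based puzzle in the style of proof-of-work schemes. The text does not specify a puzzle construction, so this particular scheme, the SHA-256 choice, and all names are assumptions made for the example.

```python
import hashlib
import itertools

def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits

def verify(challenge: bytes, nonce: int, difficulty: int) -> bool:
    """One hash: cheap for the OR to check."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty

def solve(challenge: bytes, difficulty: int) -> int:
    """About 2**difficulty hashes on average: costly for the client."""
    for nonce in itertools.count():
        if verify(challenge, nonce, difficulty):
            return nonce
```

Raising `difficulty` by one bit doubles the client's expected work while the OR's verification cost stays at a single hash, which is how such a scheme would limit the attack multiplier.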

Adversaries  can  also  attack  the  Tor  network’s  hosts  and 
network  links.  Disrupting  a  single  circuit  or  link  breaks  all 
streams  passing  along  that  part  of  the  circuit.  Users  simi¬ 
larly  lose  service  when  a  router  crashes  or  its  operator  restarts 
it. The current Tor design treats such attacks as intermittent
network failures, and depends on users and applications
to  respond  or  recover  as  appropriate.  A  future  design  could 
use  an  end-to-end  TCP-like  acknowledgment  protocol,  so  no 
streams are lost unless the entry or exit point is disrupted.
This  solution  would  require  more  buffering  at  the  network 
edges,  however,  and  the  performance  and  anonymity  impli¬ 
cations  from  this  extra  complexity  still  require  investigation. 

6.2  Exit  policies  and  abuse 

Exit  abuse  is  a  serious  barrier  to  wide-scale  Tor  deployment. 
Anonymity  presents  would-be  vandals  and  abusers  with  an 
opportunity  to  hide  the  origins  of  their  activities.  Attackers 
can  harm  the  Tor  network  by  implicating  exit  servers  for  their 
abuse.  Also,  applications  that  commonly  use  IP-based  au¬ 
thentication  (such  as  institutional  mail  or  webservers)  can  be 
fooled  by  the  fact  that  anonymous  connections  appear  to  orig¬ 
inate  at  the  exit  OR. 

We  stress  that  Tor  does  not  enable  any  new  class  of  abuse. 
Spammers  and  other  attackers  already  have  access  to  thou¬ 
sands  of  misconfigured  systems  worldwide,  and  the  Tor  net¬ 
work  is  far  from  the  easiest  way  to  launch  attacks.  But  be¬ 
cause  the  onion  routers  can  be  mistaken  for  the  originators 
of  the  abuse,  and  the  volunteers  who  run  them  may  not  want 
to  deal  with  the  hassle  of  explaining  anonymity  networks  to 
irate  administrators,  we  must  block  or  limit  abuse  through  the 
Tor  network. 

To  mitigate  abuse  issues,  each  onion  router’s  exit  policy  de¬ 


scribes  to  which  external  addresses  and  ports  the  router  will 
connect.  On  one  end  of  the  spectrum  are  open  exit  nodes 
that  will  connect  anywhere.  On  the  other  end  are  middleman 
nodes  that  only  relay  traffic  to  other  Tor  nodes,  and  private 
exit  nodes  that  only  connect  to  a  local  host  or  network.  A 
private  exit  can  allow  a  client  to  connect  to  a  given  host  or 
network  more  securely — an  external  adversary  cannot  eaves¬ 
drop  traffic  between  the  private  exit  and  the  final  destination, 
and  so  is  less  sure  of  Alice’s  destination  and  activities.  Most 
onion  routers  in  the  current  network  function  as  restricted  ex¬ 
its  that  permit  connections  to  the  world  at  large,  but  prevent 
access  to  certain  abuse-prone  addresses  and  services  such  as 
SMTP.  The  OR  might  also  be  able  to  authenticate  clients  to 
prevent  exit  abuse  without  harming  anonymity  [48]. 
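The spectrum of exit policies can be sketched as ordered accept/reject rules over addresses and ports. The rule format, the first-match semantics, the default-reject fallback, and the example address block are all assumptions for illustration; they are not the configuration syntax of any deployed router.

```python
import ipaddress

# Hypothetical rule format: (action, network, (low_port, high_port)).
ALL_PORTS = (1, 65535)

OPEN_EXIT = [("accept", "0.0.0.0/0", ALL_PORTS)]
MIDDLEMAN = [("reject", "0.0.0.0/0", ALL_PORTS)]
RESTRICTED_EXIT = [
    ("reject", "0.0.0.0/0", (25, 25)),          # no SMTP to anyone
    ("reject", "198.51.100.0/24", ALL_PORTS),   # an abuse-prone block (example)
    ("accept", "0.0.0.0/0", ALL_PORTS),         # everything else is allowed
]

def exit_allows(policy, addr: str, port: int) -> bool:
    """Return the action of the first matching rule (first-match is an assumption)."""
    ip = ipaddress.ip_address(addr)
    for action, network, (low, high) in policy:
        if ip in ipaddress.ip_network(network) and low <= port <= high:
            return action == "accept"
    return False  # nothing matched: reject by default (assumption)
```

An OP could evaluate such a policy for each OR on a circuit to pick a suitable exit for a given destination before sending the relay begin cell.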

Many administrators use port restrictions to support only a
limited  set  of  services,  such  as  HTTP,  SSH,  or  AIM.  This  is 
not  a  complete  solution,  of  course,  since  abuse  opportunities 
for  these  protocols  are  still  well  known. 

We  have  not  yet  encountered  any  abuse  in  the  deployed 
network,  but  if  we  do  we  should  consider  using  proxies  to 
clean  traffic  for  certain  protocols  as  it  leaves  the  network.  For 
example,  much  abusive  HTTP  behavior  (such  as  exploiting 
buffer  overflows  or  well-known  script  vulnerabilities)  can  be 
detected  in  a  straightforward  manner.  Similarly,  one  could 
run  automatic  spam  filtering  software  (such  as  SpamAssas- 
sin)  on  email  exiting  the  OR  network. 

ORs  may  also  rewrite  exiting  traffic  to  append  headers 
or  other  information  indicating  that  the  traffic  has  passed 
through  an  anonymity  service.  This  approach  is  commonly 
used  by  email-only  anonymity  systems.  ORs  can  also  run 
on  servers  with  hostnames  like  anonymous  to  further  alert 
abuse  targets  to  the  nature  of  the  anonymous  traffic. 

A  mixture  of  open  and  restricted  exit  nodes  allows  the  most 
flexibility  for  volunteers  running  servers.  But  while  having 
many  middleman  nodes  provides  a  large  and  robust  network, 
having  only  a  few  exit  nodes  reduces  the  number  of  points  an 
adversary  needs  to  monitor  for  traffic  analysis,  and  places  a 
greater  burden  on  the  exit  nodes.  This  tension  can  be  seen  in 
the  Java  Anon  Proxy  cascade  model,  wherein  only  one  node 
in  each  cascade  needs  to  handle  abuse  complaints — but  an  ad¬ 
versary  only  needs  to  observe  the  entry  and  exit  of  a  cascade 
to  perform  traffic  analysis  on  all  that  cascade’s  users.  The  hy¬ 
dra  model  (many  entries,  few  exits)  presents  a  different  com¬ 
promise:  only  a  few  exit  nodes  are  needed,  but  an  adversary 
needs  to  work  harder  to  watch  all  the  clients;  see  Section  10. 

Finally,  we  note  that  exit  abuse  must  not  be  dismissed  as 
a  peripheral  issue:  when  a  system’s  public  image  suffers,  it 
can  reduce  the  number  and  diversity  of  that  system’s  users, 
and  thereby  reduce  the  anonymity  of  the  system  itself.  Like 
usability,  public  perception  is  a  security  parameter.  Sadly, 
preventing  abuse  of  open  exit  nodes  is  an  unsolved  problem, 
and  will  probably  remain  an  arms  race  for  the  foreseeable 
future.  The  abuse  problems  faced  by  Princeton’s  CoDeeN 
project  [37]  give  us  a  glimpse  of  likely  issues. 


6.3  Directory  Servers 

First-generation  Onion  Routing  designs  [8,  41]  used  in-band 
network  status  updates:  each  router  flooded  a  signed  state¬ 
ment  to  its  neighbors,  which  propagated  it  onward.  But 
anonymizing  networks  have  different  security  goals  than  typ¬ 
ical  link-state  routing  protocols.  For  example,  delays  (acci¬ 
dental  or  intentional)  that  can  cause  different  parts  of  the  net¬ 
work  to  have  different  views  of  link-state  and  topology  are 
not  only  inconvenient:  they  give  attackers  an  opportunity  to 
exploit  differences  in  client  knowledge.  We  also  worry  about 
attacks  to  deceive  a  client  about  the  router  membership  list, 
topology,  or  current  network  state.  Such  partitioning  attacks 
on  client  knowledge  help  an  adversary  to  efficiently  deploy 
resources  against  a  target  [15]. 

Tor  uses  a  small  group  of  redundant,  well-known  onion 
routers  to  track  changes  in  network  topology  and  node  state, 
including  keys  and  exit  policies.  Each  such  directory  server 
acts  as  an  HTTP  server,  so  clients  can  fetch  current  network 
state  and  router  lists,  and  so  other  ORs  can  upload  state  infor¬ 
mation.  Onion  routers  periodically  publish  signed  statements 
of  their  state  to  each  directory  server.  The  directory  servers 
combine  this  information  with  their  own  views  of  network 
liveness,  and  generate  a  signed  description  (a  directory )  of 
the  entire  network  state.  Client  software  is  pre-loaded  with  a 
list  of  the  directory  servers  and  their  keys,  to  bootstrap  each 
client’s  view  of  the  network. 

When  a  directory  server  receives  a  signed  statement  for  an 
OR,  it  checks  whether  the  OR’s  identity  key  is  recognized. 
Directory  servers  do  not  advertise  unrecognized  ORs — if  they 
did,  an  adversary  could  take  over  the  network  by  creating 
many  servers  [22].  Instead,  new  nodes  must  be  approved  by 
the  directory  server  administrator  before  they  are  included. 
Mechanisms  for  automated  node  approval  are  an  area  of  ac¬ 
tive  research,  and  are  discussed  more  in  Section  9. 

Of  course,  a  variety  of  attacks  remain.  An  adversary  who 
controls  a  directory  server  can  track  clients  by  providing  them 
different  information — perhaps  by  listing  only  nodes  under 
its  control,  or  by  informing  only  certain  clients  about  a  given 
node.  Even  an  external  adversary  can  exploit  differences  in 
client  knowledge:  clients  who  use  a  node  listed  on  one  direc¬ 
tory  server  but  not  the  others  are  vulnerable. 

Thus  these  directory  servers  must  be  synchronized  and 
redundant,  so  that  they  can  agree  on  a  common  directory. 
Clients  should  only  trust  this  directory  if  it  is  signed  by  a 
threshold  of  the  directory  servers. 
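The threshold rule can be sketched as follows. This is a minimal illustration, assuming that signature verification has already happened elsewhere and that the threshold is a simple majority; the server names and the function `directory_is_trusted` are hypothetical.

```python
def directory_is_trusted(signers, known_servers, threshold):
    """Accept a directory only if enough distinct known servers signed it.

    `signers` holds the names of servers whose signatures on the directory
    have ALREADY been verified; the verification step is not shown here.
    """
    valid = set(signers) & set(known_servers)
    return len(valid) >= threshold


known = {"dir1", "dir2", "dir3"}  # hypothetical names for three servers
majority = len(known) // 2 + 1    # assumed threshold: a simple majority

print(directory_is_trusted({"dir1", "dir3"}, known, majority))
print(directory_is_trusted({"dir1", "mallory"}, known, majority))
```

Signatures from unknown parties are ignored, so an adversary cannot reach the threshold by inventing extra signers.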

The  directory  servers  in  Tor  are  modeled  after  those  in 
Mixminion  [15],  but  our  situation  is  easier.  First,  we  make 
the  simplifying  assumption  that  all  participants  agree  on  the 
set  of  directory  servers.  Second,  while  Mixminion  needs 
to  predict  node  behavior,  Tor  only  needs  a  threshold  consensus
of  the  current  state  of  the  network.  Third,  we  assume  that  we
can  fall  back  to  the  human  administrators  to
discover  and  resolve  problems  when  a  consensus  directory
cannot  be  reached.  Since  there  are  relatively  few  directory
servers  (currently  3,  but  we  expect  as  many  as  9  as  the  net¬ 
work  scales),  we  can  afford  operations  like  broadcast  to  sim¬ 
plify  the  consensus-building  protocol. 

To  avoid  attacks  where  a  router  connects  to  all  the  direc¬ 
tory  servers  but  refuses  to  relay  traffic  from  other  routers, 
the  directory  servers  must  also  build  circuits  and  use  them  to 
anonymously  test  router  reliability  [18].  Unfortunately,  this 
defense  is  not  yet  designed  or  implemented. 

Using  directory  servers  is  simpler  and  more  flexible  than 
flooding.  Flooding  is  expensive,  and  complicates  the  analysis 
when  we  start  experimenting  with  non-clique  network  topolo¬ 
gies.  Signed  directories  can  be  cached  by  other  onion  routers, 
so  directory  servers  are  not  a  performance  bottleneck  when 
we  have  many  users,  and  do  not  aid  traffic  analysis  by  forcing 
clients  to  announce  their  existence  to  any  central  point. 

7  Attacks  and  Defenses 

Below  we  summarize  a  variety  of  attacks,  and  discuss  how 
well  our  design  withstands  them. 

Passive  attacks 

Observing  user  traffic  patterns.  Observing  a  user’s  connec¬ 
tion  will  not  reveal  her  destination  or  data,  but  it  will  reveal 
traffic  patterns  (both  sent  and  received).  Profiling  via  user 
connection  patterns  requires  further  processing,  because  mul¬ 
tiple  application  streams  may  be  operating  simultaneously  or 
in  series  over  a  single  circuit. 

Observing  user  content.  While  content  at  the  user  end  is 
encrypted,  connections  to  responders  may  not  be  (indeed,  the 
responding  website  itself  may  be  hostile).  While  filtering 
content  is  not  a  primary  goal  of  Onion  Routing,  Tor  can  di¬ 
rectly  use  Privoxy  and  related  filtering  services  to  anonymize 
application  data  streams. 

Option  distinguishability.  We  allow  clients  to  choose  con¬ 
figuration  options.  For  example,  clients  concerned  about  re¬ 
quest  linkability  should  rotate  circuits  more  often  than  those 
concerned  about  traceability.  Allowing  choice  may  attract 
users  with  different  needs;  but  clients  who  are  in  the  minor¬ 
ity  may  lose  more  anonymity  by  appearing  distinct  than  they 
gain  by  optimizing  their  behavior  [1].

End-to-end  timing  correlation.  Tor  only  minimally  hides 
such  correlations.  An  attacker  watching  patterns  of  traffic  at 
the  initiator  and  the  responder  will  be  able  to  confirm  the  cor¬ 
respondence  with  high  probability.  The  greatest  protection 
currently  available  against  such  confirmation  is  to  hide  the 
connection  between  the  onion  proxy  and  the  first  Tor  node, 
by  running  the  OP  on  the  Tor  node  or  behind  a  firewall.  This 
approach  requires  an  observer  to  separate  traffic  originating  at 
the  onion  router  from  traffic  passing  through  it:  a  global  ob¬ 
server  can  do  this,  but  it  might  be  beyond  a  limited  observer’s 
capabilities. 


End-to-end  size  correlation.  Simple  packet  counting  will 
also  be  effective  in  confirming  endpoints  of  a  stream.  How¬ 
ever,  even  without  padding,  we  may  have  some  limited  pro¬ 
tection:  the  leaky  pipe  topology  means  different  numbers  of 
packets  may  enter  one  end  of  a  circuit  than  exit  at  the  other. 

Website  fingerprinting.  All  the  effective  passive  attacks 
above  are  traffic  confirmation  attacks,  which  puts  them  out¬ 
side  our  design  goals.  There  is  also  a  passive  traffic  analysis 
attack  that  is  potentially  effective.  Rather  than  searching 
exit  connections  for  timing  and  volume  correlations,  the 
adversary  may  build  up  a  database  of  “fingerprints”  contain¬ 
ing  file  sizes  and  access  patterns  for  targeted  websites.  He 
can  later  confirm  a  user’s  connection  to  a  given  site  simply 
by  consulting  the  database.  This  attack  has  been  shown  to 
be  effective  against  SafeWeb  [29].  It  may  be  less  effective
against  Tor,  since  streams  are  multiplexed  within  the  same 
circuit,  and  fingerprinting  will  be  limited  to  the  granularity 
of  cells  (currently  512  bytes).  Additional  defenses  could 
include  larger  cell  sizes,  padding  schemes  to  group  websites 
into  large  sets,  and  link  padding  or  long-range  dummies.4 
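To see why cell granularity blunts size-based fingerprints, consider this small sketch. It assumes the 512-byte cell size given above and the 498-byte relay payload reported in Section 8; the function names are hypothetical.

```python
CELL_SIZE = 512     # bytes per cell on the wire (from the text)
PAYLOAD_SIZE = 498  # usable relay payload per cell (from Section 8)


def cells_needed(num_bytes):
    """Number of fixed-size cells needed to carry `num_bytes` of data."""
    return -(-num_bytes // PAYLOAD_SIZE)  # ceiling division on integers


def observed_size(num_bytes):
    """Bytes an observer sees on the wire for a transfer of `num_bytes`."""
    return cells_needed(num_bytes) * CELL_SIZE


# Two objects of different sizes look identical at cell granularity.
print(observed_size(100), observed_size(400))
```

Any two transfers that need the same number of cells are indistinguishable by size alone, which is why larger cells or padding would coarsen the fingerprint further.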

Active  attacks 

Compromise  keys.  An  attacker  who  learns  the  TLS  session 
key  can  see  control  cells  and  encrypted  relay  cells  on  every 
circuit  on  that  connection;  learning  a  circuit  session  key  lets 
him  unwrap  one  layer  of  the  encryption.  An  attacker  who 
learns  an  OR’s  TLS  private  key  can  impersonate  that  OR  for 
the  TLS  key’s  lifetime,  but  he  must  also  learn  the  onion  key 
to  decrypt  create  cells  (and  because  of  perfect  forward  se¬ 
crecy,  he  cannot  hijack  already  established  circuits  without 
also  compromising  their  session  keys).  Periodic  key  rotation 
limits  the  window  of  opportunity  for  these  attacks.  On  the 
other  hand,  an  attacker  who  learns  a  node’s  identity  key  can 
replace  that  node  indefinitely  by  sending  new  forged  descrip¬ 
tors  to  the  directory  servers. 

Iterated  compromise.  A  roving  adversary  who  can  com¬ 
promise  ORs  (by  system  intrusion,  legal  coercion,  or  extrale¬ 
gal  coercion)  could  march  down  the  circuit  compromising  the 
nodes  until  he  reaches  the  end.  Unless  the  adversary  can  com¬ 
plete  this  attack  within  the  lifetime  of  the  circuit,  however, 
the  ORs  will  have  discarded  the  necessary  information  before 
the  attack  can  be  completed.  (Thanks  to  the  perfect  forward 
secrecy  of  session  keys,  the  attacker  cannot  force  nodes  to  de¬ 
crypt  recorded  traffic  once  the  circuits  have  been  closed.)  Ad¬ 
ditionally,  building  circuits  that  cross  jurisdictions  can  make 
legal  coercion  harder — this  phenomenon  is  commonly  called 
“jurisdictional  arbitrage.”  The  Java  Anon  Proxy  project  recently
experienced  the  need  for  this  approach,  when  a  German  court
forced  them  to  add  a  backdoor  to  their  nodes  [51].

Run  a  recipient.  An  adversary  running  a  webserver  trivially
learns  the  timing  patterns  of  users  connecting  to  it,  and  can
introduce  arbitrary  patterns  in  its  responses.  End-to-end  attacks
become  easier:  if  the  adversary  can  induce  users  to  connect
to  his  webserver  (perhaps  by  advertising  content  targeted  to
those  users),  he  now  holds  one  end  of  their  connection.  There
is  also  a  danger  that  application  protocols  and  associated
programs  can  be  induced  to  reveal  information  about  the  initiator.
Tor  depends  on  Privoxy  and  similar  protocol  cleaners  to  solve
this  latter  problem.

4  Note  that  this  fingerprinting  attack  should  not  be  confused  with  the  much
more  complicated  latency  attacks  of  [5],  which  require  a  fingerprint  of  the
latencies  of  all  circuits  through  the  network,  combined  with  those  from  the
network  edges  to  the  target  user  and  the  responder  website.

Run  an  onion  proxy.  It  is  expected  that  end  users  will  nearly 
always  run  their  own  local  onion  proxy.  However,  in  some 
settings,  it  may  be  necessary  for  the  proxy  to  run  remotely — 
typically,  in  institutions  that  want  to  monitor  the  activity  of 
those  connecting  to  the  proxy.  Compromising  an  onion  proxy 
compromises  all  future  connections  through  it. 

DoS  non-observed  nodes.  An  observer  who  can  only  watch 
some  of  the  Tor  network  can  increase  the  value  of  this  traffic 
by  attacking  non-observed  nodes  to  shut  them  down,  reduce 
their  reliability,  or  persuade  users  that  they  are  not  trustwor¬ 
thy.  The  best  defense  here  is  robustness. 

Run  a  hostile  OR.  In  addition  to  being  a  local  observer,  an 
isolated  hostile  node  can  create  circuits  through  itself,  or  alter 
traffic  patterns  to  affect  traffic  at  other  nodes.  Nonetheless,  a 
hostile  node  must  be  immediately  adjacent  to  both  endpoints 
to  compromise  the  anonymity  of  a  circuit.  If  an  adversary  can 
run  multiple  ORs,  and  can  persuade  the  directory  servers  that 
those  ORs  are  trustworthy  and  independent,  then  occasionally 
some  user  will  choose  one  of  those  ORs  for  the  start  and  another
as  the  end  of  a  circuit.  If  an  adversary  controls  m  >  1  of  N
nodes,  he  can  correlate  at  most  (m/N)^2  of  the  traffic,
although  an  adversary  could  still  attract  a  disproportionately 
large  amount  of  traffic  by  running  an  OR  with  a  permissive 
exit  policy,  or  by  degrading  the  reliability  of  other  routers. 
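The bound can be made concrete with a short calculation. The sketch below assumes that entry and exit nodes are chosen independently and uniformly, so the chance that both are hostile is the product of two m/N terms; the function name is hypothetical and the node counts are only examples.

```python
def max_correlated_fraction(m, n):
    """Upper bound on the fraction of traffic an adversary can correlate.

    m: number of nodes the adversary controls
    n: total number of nodes in the network
    """
    if n <= 0 or not 0 <= m <= n:
        raise ValueError("need 0 <= m <= n and n > 0")
    return (m / n) ** 2


# Example: 4 hostile nodes in a 32-node network.
print(max_correlated_fraction(4, 32))  # 0.015625
```

The bound ignores the caveat in the text: a hostile OR with a permissive exit policy, or one that degrades its competitors, can attract more than its uniform share of circuits.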

Introduce  timing  into  messages.  This  is  simply  a  stronger 
version  of  passive  timing  attacks  already  discussed  earlier. 

Tagging  attacks.  A  hostile  node  could  “tag”  a  cell  by  al¬ 
tering  it.  If  the  stream  were,  for  example,  an  unencrypted 
request  to  a  Web  site,  the  garbled  content  coming  out  at  the 
appropriate  time  would  confirm  the  association.  However,  in¬ 
tegrity  checks  on  cells  prevent  this  attack. 

Replace  contents  of  unauthenticated  protocols.  When  relaying
an  unauthenticated  protocol  like  HTTP,  a  hostile  exit
node  can  impersonate  the  target  server.  Clients  should  prefer 
protocols  with  end-to-end  authentication. 

Replay  attacks.  Some  anonymity  protocols  are  vulnerable 
to  replay  attacks.  Tor  is  not;  replaying  one  side  of  a  hand¬ 
shake  will  result  in  a  different  negotiated  session  key,  and  so 
the  rest  of  the  recorded  session  can’t  be  used. 
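The following toy sketch shows why a replayed handshake half is useless, assuming a Diffie-Hellman-style exchange in which the responder picks a fresh secret every time. The group parameters are deliberately tiny and insecure, and the key derivation is a placeholder; none of this is Tor's actual handshake.

```python
import hashlib
import secrets

P = 2**127 - 1  # a Mersenne prime; far too small for real use
G = 3           # toy generator, chosen only for illustration


def respond(client_public):
    """Responder side: pick a fresh secret and derive a session key."""
    y = secrets.randbelow(P - 2) + 1  # new secret for every handshake
    server_public = pow(G, y, P)
    shared = pow(client_public, y, P)
    key = hashlib.sha256(shared.to_bytes(16, "big")).digest()
    return server_public, key


# The initiator's half of the handshake, which an attacker records.
x = secrets.randbelow(P - 2) + 1
client_public = pow(G, x, P)

_, key_original = respond(client_public)
_, key_replayed = respond(client_public)  # attacker replays the same half

print(key_original != key_replayed)
```

Because the replayed exchange yields a different key, cells recorded under the original session key cannot be decrypted or injected in the new session.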

Smear  attacks.  An  attacker  could  use  the  Tor  network  for 
socially  disapproved  acts,  to  bring  the  network  into  disrepute 
and  get  its  operators  to  shut  it  down.  Exit  policies  reduce 
the  possibilities  for  abuse,  but  ultimately  the  network  requires 
volunteers  who  can  tolerate  some  political  heat. 

Distribute  hostile  code.  An  attacker  could  trick  users
into  running  subverted  Tor  software  that  did  not,  in  fact,
anonymize  their  connections — or  worse,  could  trick  ORs 
into  running  weakened  software  that  provided  users  with 
less  anonymity.  We  address  this  problem  (but  do  not  solve  it 
completely)  by  signing  all  Tor  releases  with  an  official  public 
key,  and  including  an  entry  in  the  directory  that  lists  which 
versions  are  currently  believed  to  be  secure.  To  prevent  an 
attacker  from  subverting  the  official  release  itself  (through 
threats,  bribery,  or  insider  attacks),  we  provide  all  releases  in 
source  code  form,  encourage  source  audits,  and  frequently 
warn  our  users  never  to  trust  any  software  (even  from  us)  that 
comes  without  source. 

Directory  attacks 

Destroy  directory  servers.  If  a  few  directory  servers  disap¬ 
pear,  the  others  still  decide  on  a  valid  directory.  So  long 
as  any  directory  servers  remain  in  operation,  they  will  still 
broadcast  their  views  of  the  network  and  generate  a  consensus 
directory.  (If  more  than  half  are  destroyed,  this  directory  will 
not,  however,  have  enough  signatures  for  clients  to  use  it  au¬ 
tomatically;  human  intervention  will  be  necessary  for  clients 
to  decide  whether  to  trust  the  resulting  directory.) 

Subvert  a  directory  server.  By  taking  over  a  directory 
server,  an  attacker  can  partially  influence  the  final  directory. 
Since  ORs  are  included  or  excluded  by  majority  vote,  the  cor¬ 
rupt  directory  can  at  worst  cast  a  tie-breaking  vote  to  decide 
whether  to  include  marginal  ORs.  It  remains  to  be  seen  how 
often  such  marginal  cases  occur  in  practice. 
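A minimal sketch of majority voting over router lists, under the assumption that each directory server simply submits the set of ORs it would list. The router names and the function `consensus_members` are hypothetical, and the real agreement protocol is not shown.

```python
def consensus_members(votes):
    """Return the ORs listed by more than half of the directory servers.

    votes: dict mapping a directory server name to the set of ORs it lists.
    """
    tally = {}
    for listed in votes.values():
        for router in listed:
            tally[router] = tally.get(router, 0) + 1
    return {r for r, count in tally.items() if count > len(votes) / 2}


votes = {
    "honest_a": {"or1", "or2"},
    "honest_b": {"or1"},
    "corrupt": {"or1", "or2", "evil"},  # tries to add its own router
}
print(sorted(consensus_members(votes)))  # ['or1', 'or2']
```

Here the corrupt server cannot add `evil` on its own, but it does cast the deciding vote for `or2`, the marginal router on which the honest servers disagree.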

Subvert  a  majority  of  directory  servers.  An  adversary  who 
controls  more  than  half  the  directory  servers  can  include  as 
many  compromised  ORs  in  the  final  directory  as  he  wishes. 
We  must  ensure  that  directory  server  operators  are  indepen¬ 
dent  and  attack-resistant. 

Encourage  directory  server  dissent.  The  directory  agree¬ 
ment  protocol  assumes  that  directory  server  operators  agree 
on  the  set  of  directory  servers.  An  adversary  who  can  per¬ 
suade  some  of  the  directory  server  operators  to  distrust  one 
another  could  split  the  quorum  into  mutually  hostile  camps, 
thus  partitioning  users  based  on  which  directory  they  use.  Tor 
does  not  address  this  attack. 

Trick  the  directory  servers  into  listing  a  hostile  OR.  Our 
threat  model  explicitly  assumes  directory  server  operators 
will  be  able  to  filter  out  most  hostile  ORs. 

Convince  the  directories  that  a  malfunctioning  OR  is 
working.  In  the  current  Tor  implementation,  directory  servers 
assume  that  an  OR  is  running  correctly  if  they  can  start  a 
TLS  connection  to  it.  A  hostile  OR  could  easily  subvert  this 
test  by  accepting  TLS  connections  from  ORs  but  ignoring  all 
cells.  Directory  servers  must  actively  test  ORs  by  building 
circuits  and  streams  as  appropriate.  The  tradeoffs  of  a  similar 
approach  are  discussed  in  [18]. 

Attacks  against  rendezvous  points 

Make  many  introduction  requests.  An  attacker  could  try  to
deny  Bob  service  by  flooding  his  introduction  points  with
requests.  Because  the  introduction  points  can  block  requests
that  lack  authorization  tokens,  however,  Bob  can  restrict  the
volume  of  requests  he  receives,  or  require  a  certain  amount  of 
computation  for  every  request  he  receives. 
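One way to "require a certain amount of computation" is a hash-based puzzle. The sketch below is an assumption for illustration, in the style of hashcash; the text does not specify a mechanism, and the names `verify` and `solve` and the difficulty value are hypothetical.

```python
import hashlib


def verify(request, nonce, difficulty_bits):
    """Check that SHA-256(request || nonce) starts with enough zero bits."""
    digest = hashlib.sha256(request + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") >> (256 - difficulty_bits) == 0


def solve(request, difficulty_bits):
    """Client side: search for a nonce that satisfies the puzzle."""
    nonce = 0
    while not verify(request, nonce, difficulty_bits):
        nonce += 1
    return nonce


request = b"introduce-me"    # placeholder for a real introduction request
nonce = solve(request, 12)   # about 2**12 hash attempts on average
print(verify(request, nonce, 12))
```

Verification costs one hash while solving costs thousands on average, so Bob can raise the price of each request without doing much work himself.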

Attack  an  introduction  point.  An  attacker  could  disrupt  a 
location-hidden  service  by  disabling  its  introduction  points. 
But  because  a  service’s  identity  is  attached  to  its  public  key, 
the  service  can  simply  re-advertise  itself  at  a  different  intro¬ 
duction  point.  Advertisements  can  also  be  done  secretly  so 
that  only  high-priority  clients  know  the  address  of  Bob’s  in¬ 
troduction  points  or  so  that  different  clients  know  of  different 
introduction  points.  This  forces  the  attacker  to  disable  all  pos¬ 
sible  introduction  points. 

Compromise  an  introduction  point.  An  attacker  who  con¬ 
trols  Bob’s  introduction  point  can  flood  Bob  with  introduction 
requests,  or  prevent  valid  introduction  requests  from  reaching 
him.  Bob  can  notice  a  flood,  and  close  the  circuit.  To  notice 
blocking  of  valid  requests,  however,  he  should  periodically 
test  the  introduction  point  by  sending  rendezvous  requests 
and  making  sure  he  receives  them. 

Compromise  a  rendezvous  point.  A  rendezvous  point  is  no 
more  sensitive  than  any  other  OR  on  a  circuit,  since  all  data 
passing  through  the  rendezvous  is  encrypted  with  a  session 
key  shared  by  Alice  and  Bob. 

8  Early  experiences:  Tor  in  the  Wild 

As  of  mid-May  2004,  the  Tor  network  consists  of  32  nodes 
(24  in  the  US,  8  in  Europe),  and  more  are  joining  each  week 
as  the  code  matures.  (For  comparison,  the  current  remailer 
network  has  about  40  nodes.)  Each  node  has  at  least  a 
768Kb/768Kb  connection,  and  many  have  10Mb.  The  num¬ 
ber  of  users  varies  (and  of  course,  it’s  hard  to  tell  for  sure),  but 
we  sometimes  have  several  hundred  users — administrators  at 
several  companies  have  begun  sending  their  entire  depart¬ 
ments’  web  traffic  through  Tor,  to  block  other  divisions  of 
their  company  from  reading  their  traffic.  Tor  users  have  re¬ 
ported  using  the  network  for  web  browsing,  FTP,  IRC,  AIM, 
Kazaa,  SSH,  and  recipient-anonymous  email  via  rendezvous 
points.  One  user  has  anonymously  set  up  a  Wiki  as  a  hidden 
service,  where  other  users  anonymously  publish  the  addresses 
of  their  hidden  services. 

Each  Tor  node  currently  processes  roughly  800,000  relay 
cells  (a  bit  under  half  a  gigabyte)  per  week.  On  average,  about 
80%  of  each  498-byte  payload  is  full  for  cells  going  back  to 
the  client,  whereas  about  40%  is  full  for  cells  coming  from  the 
client.  (The  difference  arises  because  most  of  the  network’s 
traffic  is  web  browsing.)  Interactive  traffic  like  SSH  brings 
down  the  average  a  lot — once  we  have  more  experience,  and 
assuming  we  can  resolve  the  anonymity  issues,  we  may  parti¬ 
tion  traffic  into  two  relay  cell  sizes:  one  to  handle  bulk  traffic 
and  one  for  interactive  traffic. 
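The figures in this paragraph can be checked with simple arithmetic. The sketch assumes 512-byte cells (the size given in Section 7) and reads "a bit under half a gigabyte" in decimal units; the function name is hypothetical.

```python
CELL_SIZE = 512           # bytes per cell
PAYLOAD_SIZE = 498        # bytes of relay payload per cell
CELLS_PER_WEEK = 800_000  # relay cells per node per week

weekly_bytes = CELLS_PER_WEEK * CELL_SIZE
print(weekly_bytes)  # 409600000, a bit under half a gigabyte


def avg_payload_bytes(fill_percent):
    """Average useful bytes per cell at a given payload fill level."""
    return PAYLOAD_SIZE * fill_percent / 100


print(avg_payload_bytes(80))  # cells going back to the client
print(avg_payload_bytes(40))  # cells coming from the client
```

At 40% fill, more than half of each client-to-network cell is padding, which is the motivation for considering a second, smaller cell size for interactive traffic.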


Based  in  part  on  our  restrictive  default  exit  policy  (we  re¬ 
ject  SMTP  requests)  and  our  low  profile,  we  have  had  no 
abuse  issues  since  the  network  was  deployed  in  October  2003. 
Our  slow  growth  rate  gives  us  time  to  add  features,  resolve 
bugs,  and  get  a  feel  for  what  users  actually  want  from  an 
anonymity  system.  Even  though  having  more  users  would 
bolster  our  anonymity  sets,  we  are  not  eager  to  attract  the 
Kazaa  or  warez  communities — we  feel  that  we  must  build  a 
reputation  for  privacy,  human  rights,  research,  and  other  so¬ 
cially  laudable  activities. 

As  for  performance,  profiling  shows  that  Tor  spends  almost 
all  its  CPU  time  in  AES,  which  is  fast.  Current  latency  is 
attributable  to  two  factors.  First,  network  latency  is  critical: 
we  are  intentionally  bouncing  traffic  around  the  world  several 
times.  Second,  our  end-to-end  congestion  control  algorithm 
focuses  on  protecting  volunteer  servers  from  accidental  DoS 
rather  than  on  optimizing  performance.  To  quantify  these  ef¬ 
fects,  we  did  some  informal  tests  using  a  network  of  4  nodes 
on  the  same  machine  (a  heavily  loaded  1GHz  Athlon).  We 
downloaded  a  60  megabyte  file  from  debian.org  every  30
minutes  for  54  hours  (108  sample  points).  It  arrived  in  about 
300  seconds  on  average,  compared  to  210s  for  a  direct  down¬ 
load.  We  ran  a  similar  test  on  the  production  Tor  network, 
fetching  the  front  page  of  cnn.com  (55  kilobytes):  while
a  direct  download  consistently  took  about  0.3s,  the  perfor¬ 
mance  through  Tor  varied.  Some  downloads  were  as  fast  as 
0.4s,  with  a  median  at  2.8s,  and  90%  finishing  within  5.3s.  It 
seems  that  as  the  network  expands,  the  chance  of  building  a 
slow  circuit  (one  that  includes  a  slow  or  heavily  loaded  node 
or  link)  is  increasing.  On  the  other  hand,  as  our  users  remain 
satisfied  with  this  increased  latency,  we  can  address  our  per¬ 
formance  incrementally  as  we  proceed  with  development. 

Although  Tor’s  clique  topology  and  full-visibility  directo¬ 
ries  present  scaling  problems,  we  still  expect  the  network  to 
support  a  few  hundred  nodes  and  maybe  10,000  users  before 
we’re  forced  to  become  more  distributed.  With  luck,  the  ex¬ 
perience  we  gain  running  the  current  topology  will  help  us 
choose  among  alternatives  when  the  time  comes. 

9  Open  Questions  in  Low-latency  Anonymity 

In  addition  to  the  non-goals  in  Section  3,  many  questions 
must  be  solved  before  we  can  be  confident  of  Tor’s  security. 

Many  of  these  open  issues  are  questions  of  balance.  For 
example,  how  often  should  users  rotate  to  fresh  circuits?  Fre¬ 
quent  rotation  is  inefficient,  expensive,  and  may  lead  to  inter¬ 
section  attacks  and  predecessor  attacks  [54],  but  infrequent 
rotation  makes  the  user’s  traffic  linkable.  Besides  opening 
fresh  circuits,  clients  can  also  exit  from  the  middle  of  the  cir¬ 
cuit,  or  truncate  and  re-extend  the  circuit.  More  analysis  is 
needed  to  determine  the  proper  tradeoff. 

How  should  we  choose  path  lengths?  If  Alice  always  uses 
two  hops,  then  both  ORs  can  be  certain  that  by  colluding  they 
will  learn  about  Alice  and  Bob.  In  our  current  approach,  Alice
always  chooses  at  least  three  nodes  unrelated  to  herself  and
her  destination.  Should  Alice  choose  a  random  path  length 
(e.g.  from  a  geometric  distribution)  to  foil  an  attacker  who 
uses  timing  to  learn  that  he  is  the  fifth  hop  and  thus  concludes 
that  both  Alice  and  the  responder  are  running  ORs? 
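As a sketch of what a randomized path length could look like, the function below adds a geometrically distributed number of extra hops to the three-hop minimum. This illustrates the open question only and is not something the current design does; the parameter `p_stop` and the function name are assumptions.

```python
import random

MIN_HOPS = 3  # the current minimum path length


def choose_path_length(p_stop=0.5, rng=random):
    """Return MIN_HOPS plus a geometrically distributed number of extra hops.

    After each hop, stop extending with probability `p_stop`.
    """
    if not 0 < p_stop <= 1:
        raise ValueError("p_stop must be in (0, 1]")
    length = MIN_HOPS
    while rng.random() >= p_stop:
        length += 1
    return length


rng = random.Random(2004)  # seeded only to make the demo repeatable
print([choose_path_length(0.5, rng) for _ in range(10)])
```

With `p_stop` of 0.5 the expected path has four hops, so the anonymity gain from unpredictability must be weighed against the added latency of longer circuits.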

Throughout  this  paper,  we  have  assumed  that  end-to-end 
traffic  confirmation  will  immediately  and  automatically  de¬ 
feat  a  low-latency  anonymity  system.  Even  high-latency 
anonymity  systems  can  be  vulnerable  to  end-to-end  traffic 
confirmation,  if  the  traffic  volumes  are  high  enough,  and  if 
users’  habits  are  sufficiently  distinct  [14,  31].  Can  anything 
be  done  to  make  low-latency  systems  resist  these  attacks  as 
well  as  high-latency  systems?  Tor  already  makes  some  ef¬ 
fort  to  conceal  the  starts  and  ends  of  streams  by  wrapping 
long-range  control  commands  in  identical-looking  relay  cells. 
Link  padding  could  frustrate  passive  observers  who  count 
packets;  long-range  padding  could  work  against  observers 
who  own  the  first  hop  in  a  circuit.  But  more  research  remains 
to  find  an  efficient  and  practical  approach.  Volunteers  pre¬ 
fer  not  to  run  constant-bandwidth  padding;  but  no  convinc¬ 
ing  traffic  shaping  approach  has  been  specified.  Recent  work 
on  long-range  padding  [33]  shows  promise.  One  could  also 
try  to  reduce  correlation  in  packet  timing  by  batching  and  re¬ 
ordering  packets,  but  it  is  unclear  whether  this  could  improve 
anonymity  without  introducing  so  much  latency  as  to  render 
the  network  unusable. 

A  cascade  topology  may  better  defend  against  traffic  con¬ 
firmation  by  aggregating  users,  and  making  padding  and  mix¬ 
ing  more  affordable.  Does  the  hydra  topology  (many  input 
nodes,  few  output  nodes)  work  better  against  some  adver¬ 
saries?  Are  we  going  to  get  a  hydra  anyway  because  most 
nodes  will  be  middleman  nodes? 

Common  wisdom  suggests  that  Alice  should  run  her  own 
OR  for  best  anonymity,  because  traffic  coming  from  her  node 
could  plausibly  have  come  from  elsewhere.  How  much  mix¬ 
ing  does  this  approach  need?  Is  it  immediately  beneficial 
because  of  real-world  adversaries  that  can’t  observe  Alice’s 
router,  but  can  run  routers  of  their  own? 

To  scale  to  many  users,  and  to  prevent  an  attacker  from 
observing  the  whole  network,  it  may  be  necessary  to  support 
far  more  servers  than  Tor  currently  anticipates.  This  intro¬ 
duces  several  issues.  First,  if  approval  by  a  central  set  of  di¬ 
rectory  servers  is  no  longer  feasible,  what  mechanism  should 
be  used  to  prevent  adversaries  from  signing  up  many  collud¬ 
ing  servers?  Second,  if  clients  can  no  longer  have  a  complete 
picture  of  the  network,  how  can  they  perform  discovery  while 
preventing  attackers  from  manipulating  or  exploiting  gaps  in 
their  knowledge?  Third,  if  there  are  too  many  servers  for  ev¬ 
ery  server  to  constantly  communicate  with  every  other,  which 
non-clique  topology  should  the  network  use?  (Restricted- 
route  topologies  promise  comparable  anonymity  with  better 
scalability  [13],  but  whatever  topology  we  choose,  we  need 
some  way  to  keep  attackers  from  manipulating  their  position
within  it  [21].)  Fourth,  if  no  central  authority  is  tracking
server  reliability,  how  do  we  stop  unreliable  servers  from
making  the  network  unusable?  Fifth,  do  clients  receive  so 
much  anonymity  from  running  their  own  ORs  that  we  should 
expect  them  all  to  do  so  [1],  or  do  we  need  another  incentive 
structure  to  motivate  them?  Tarzan  and  MorphMix  present 
possible  solutions. 

When  a  Tor  node  goes  down,  all  its  circuits  (and  thus 
streams)  must  break.  Will  users  abandon  the  system  be¬ 
cause  of  this  brittleness?  How  well  does  the  method  in  Sec¬ 
tion  6.1  allow  streams  to  survive  node  failure?  If  affected 
users  rebuild  circuits  immediately,  how  much  anonymity  is 
lost?  It  seems  the  problem  is  even  worse  in  a  peer-to-peer 
environment — such  systems  don’t  yet  provide  an  incentive 
for  peers  to  stay  connected  when  they’re  done  retrieving  con¬ 
tent,  so  we  would  expect  a  higher  churn  rate. 

10  Future  Directions 

Tor  brings  together  many  innovations  into  a  unified  deploy¬ 
able  system.  The  next  immediate  steps  include: 

Scalability:  Tor’s  emphasis  on  deployability  and  design 
simplicity  has  led  us  to  adopt  a  clique  topology,  semi- 
centralized  directories,  and  a  full-network-visibility  model 
for  client  knowledge.  These  properties  will  not  scale  past 
a  few  hundred  servers.  Section  9  describes  some  promising 
approaches,  but  more  deployment  experience  will  be  helpful 
in  learning  the  relative  importance  of  these  bottlenecks. 

Bandwidth  classes:  This  paper  assumes  that  all  ORs  have 
good  bandwidth  and  latency.  We  should  instead  adopt  the 
MorphMix  model,  where  nodes  advertise  their  bandwidth 
level  (DSL,  T1,  T3),  and  Alice  avoids  bottlenecks  by  choosing
nodes  that  match  or  exceed  her  bandwidth.  In  this  way
DSL  users  can  usefully  join  the  Tor  network. 

Incentives:  Volunteers  who  run  nodes  are  rewarded  with 
publicity  and  possibly  better  anonymity  [1].  More  nodes 
means  increased  scalability,  and  more  users  can  mean  more 
anonymity.  We  need  to  continue  examining  the  incentive 
structures  for  participating  in  Tor.  Further,  we  need  to  ex¬ 
plore  more  approaches  to  limiting  abuse,  and  understand  why 
most  people  don’t  bother  using  privacy  systems. 

Cover  traffic:  Currently  Tor  omits  cover  traffic — its  costs 
in  performance  and  bandwidth  are  clear  but  its  security  ben¬ 
efits  are  not  well  understood.  We  must  pursue  more  research 
on  link-level  cover  traffic  and  long-range  cover  traffic  to  de¬ 
termine  whether  some  simple  padding  method  offers  provable 
protection  against  our  chosen  adversary. 

Caching at exit nodes: Perhaps each exit node should run
a caching web proxy [47], to improve anonymity for cached
pages (Alice’s request never leaves the Tor network), to improve
speed, and to reduce bandwidth cost. On the other
hand, forward security is weakened because caches constitute
a record of retrieved files. We must find the right balance
between usability and security.


Better directory distribution: Clients currently download a
description of the entire network every 15 minutes. As the
state grows larger and clients more numerous, we may need
a solution in which clients receive incremental updates to directory
state. More generally, we must find more scalable yet
practical ways to distribute up-to-date snapshots of network
status without introducing new attacks.
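One way to picture incremental updates is as a delta between two directory snapshots. The sketch below assumes a snapshot is a mapping from a router identifier to its descriptor; that representation and the function names are hypothetical, not Tor's directory format. It deliberately omits signing and freshness checks, which a real design would need in order to avoid introducing new attacks.

```python
def diff_directory(old, new):
    """Compute the delta that turns snapshot `old` into snapshot `new`.

    Both snapshots are dicts mapping a router identifier to its descriptor.
    """
    # Entries that are new, or whose descriptor differs from the old one.
    changed = {k: v for k, v in new.items() if k not in old or old[k] != v}
    # Entries that disappeared from the network description.
    removed = [k for k in old if k not in new]
    return {"changed": changed, "removed": removed}


def apply_delta(old, delta):
    """Rebuild the new snapshot from the old one plus a delta."""
    gone = set(delta["removed"])
    updated = {k: v for k, v in old.items() if k not in gone}
    updated.update(delta["changed"])
    return updated
```

A client holding the previous snapshot would then fetch only the delta, whose size depends on how much the network changed rather than on its total size.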

Further specification review: Our public byte-level specification
[20] needs external review. We hope that as Tor is
deployed, more people will examine its specification.

Multisystem interoperability: We are currently working
with the designer of MorphMix to unify the specification and
implementation of the common elements of our two systems.
So far, this seems to be relatively straightforward. Interoperability
will allow testing and direct comparison of the two
designs for trust and scalability.

Wider-scale deployment: The original goal of Tor was to
gain experience in deploying an anonymizing overlay network,
and learn from having actual users. We are now at a
point in design and development where we can start deploying
a wider network. Once we have many actual users, we
will doubtless be better able to evaluate some of our design
decisions, including our robustness/latency tradeoffs, our performance
tradeoffs (including cell size), our abuse-prevention
mechanisms, and our overall usability.